Qatar Foundation Annual Research Conference Proceedings Volume 2016 Issue 1
- Conference date: 22-23 Mar 2016
- Location: Qatar National Convention Center (QNCC), Doha, Qatar
- Volume number: 2016
- Published: 21 March 2016
Accelerating Data Synchronization between Smartphones and Tablets using PowerFolder in IEEE 802.11 Infrastructure-based Mesh Networks
Authors: Kalman Graffi and Andre Ippisch
Smartphones are nowadays widely available and popular in first, second and third world countries. Nevertheless, their connectivity is limited to GPRS/UMTS/LTE connections with small monthly data plans, even if the devices are near each other, while local communication options such as Bluetooth or NFC only provide very poor data rates. Future applications envision smartphones and tablets as the main working and leisure devices, and with the upcoming trends in multimedia and working documents, file sizes are expected to grow further. In strong contrast to this, smartphone users face very small monthly data plans for communicating over the Internet: at most a few GB are available per month, while regularly 1 GB or much less mobile data traffic is available to smartphone users worldwide. Thus, future data exchange between smartphones and tablets is severely limited by current technology and business models. Even if the business models for wireless Internet plans allowed unlimited data exchange, most regions of the world would still be limited in connectivity and data exchange options. This limitation in the connectivity of the future's main communication devices, namely smartphones and tablets, is striking, as current solutions strongly handicap the data exchange between these devices. Considering that data exchange and synchronization often target geographically close areas, nearby friends or colleagues, it is remarkable that local data exchange is so strongly limited by missing or traffic-limited connectivity to the Internet.
In this presentation, we present a novel approach to high-speed data transfers between smartphones and/or tablets in local environments. Specifically, our approach enables empirically tested transmission speeds of up to 60 Mbit/s between nearby nodes in any environment. It does not require any contract with a mobile Internet provider and is not subject to any data plan restriction. We introduce a novel IEEE 802.11 infrastructure-mode based mesh networking approach, which allows Android phones to create multihop mesh networks despite the limitations of the WiFi standard. With this approach, wireless multihop networks can be built, as we evaluate in empirical measurements, allowing automatic synchronization and data exchange between smartphones and tablets at distances of up to 100 meters for a single hop and several wireless hops beyond. One use case considers colleagues working in the same building who would like to exchange data on their smartphones and tablets: data requests and offerings are signaled in the multihop environment, and dedicated wireless high-speed connections are established to transfer the data. Another use case is local communication with mails, images and chat messages with friends nearby; as the approach also supports multicasting, several users can be addressed with a single message. A further use case is a wireless file sharing or data exchange service which allows users to specify their interests, such as action movies on a campus or sale promotions in a mall: while the user walks along, his smartphone picks up relevant data without user interaction. One use case relevant for Qatar is the area of smart cities: smart devices would also be able to pick up sensor data from buildings, local transport vehicles or citizens in a delay-tolerant fashion, using the GPS functionality of Android to deliver accurate sensor information tagged with the location and time of the sensor snapshot. Finally, this local, high-speed wireless data synchronization approach allows implementing a data synchronization service such as Dropbox, Box or PowerFolder between smartphones and tablets, a novel feature on the market. To implement this idea, we are collaborating with PowerFolder, Germany's market leader in the field of data synchronization at universities and in academia. From the technology side, we observe that Android has a worldwide market share of around 80% as a smartphone operating system. While Android supports connections to the Internet and cloud-based services very well, it only offers Bluetooth and NFC for local connectivity, and both technologies provide very low data rates. Wi-Fi Direct is also supported, but, similarly to Bluetooth, it requires a lot of user interaction and thus does not scale to many contact partners. The IEEE 802.11 standard supports an ad hoc mode for local, high-speed wireless communication. Unfortunately, Android does not support the IEEE 802.11 ad hoc mode, which would allow local high-speed connections of up to 11 Mbit/s. Instead, we use the infrastructure mode of IEEE 802.11 to create ad hoc wireless mesh networks, which supports much higher data rates of up to 54 Mbit/s and beyond.
Please note that Wi-Fi Direct also claims to allow similar performance, but it requires heavy user interaction and failed in our empirical studies to connect more than 3–4 nodes reliably, whereas we aim for interaction among hundreds of nodes in a delay-tolerant manner without user interaction. Our aim is to 1. use only unrooted Android functionality, 2. allow nodes to find each other through the wireless medium, 3. automatically connect and exchange signaling information without user interaction, 4. decide on messages to exchange in a delay-tolerant manner supporting opportunistic networking, 5. coordinate the transmission between various nodes in case more than one node is in proximity, 6. allow single-hop data transfers based on the signaled have and need information, 7. allow multihop node discovery and thus 8. multihop routing of data packets. We reach our aim by using the unrooted Android API that allows apps to open a WiFi hotspot for Internet tethering. While we do not want to use the Internet, the service also allows clients to connect to the hotspot and to exchange messages with it. Using this approach, i.e. when an Android phone or tablet opens a hotspot, other Android devices running the app can join without user interaction. For that, the app on the joining node creates a list of known WiFi networks, which are consistently named “P2P-Hotspot-[PublicKeyOfHotspotNode]”. The public key of the hotspot node is used as a unique identifier of the node and as an option to asymmetrically encrypt the communication to that node. With these Android API functionalities we are able to dynamically connect unrooted, off-the-shelf Android devices that run our app. The IEEE 802.11 infrastructure mode that we use has the characteristic that the access point, in our case the hotspot, is involved in any communication with attached clients. Clients can only communicate with each other through the hotspot, and all clients share the available bandwidth of the WiFi cell to communicate with the hotspot. Thus, for a set of nodes, it is more advisable to have dedicated 1-to-1 connections, with one node being the hotspot and the other the client, for fast high-speed transmission rather than having all nodes connected over the hotspot and sharing their bandwidth. In order to support this, we differentiate between a dedicated signaling phase and a dedicated data transfer phase. Nodes scan the available WiFi networks and look for specific SSIDs, namely “P2P-Hotspot-[PublicKeyOfHotspotNode]”; these are available hotspots. As the app considers these networks as known and the access key is hardcoded, a direct connection can be established without user interaction. These steps fulfill requirements 1 and 2. Next, the node signals the data it holds for the various destination nodes, addressed to nodes with specific public keys as node identifiers. Nodes can also signal data that they generally share on a keyword basis. The hotspot gathers this signaling information from its clients and creates an index of which node has what (have list) and wants what (want list). Based on this, potential matches can be calculated and communicated to the clients. In order to have direct high-speed connections, these nodes release their connection to the hotspot and establish a new connection between each other: one as a hotspot and one as connected client.
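The matching step at the hotspot can be illustrated with a short sketch. The following Python fragment is a simplified illustration of the signaling phase only, with hypothetical have/want lists keyed by node public keys; it is not the actual PowerFolder/Android implementation.

```python
# Illustrative sketch of the hotspot's signaling phase: clients announce
# what they have and what they want, and the hotspot pairs them for
# dedicated 1-to-1 transfers. Names and data structures are hypothetical.

SSID_PREFIX = "P2P-Hotspot-"  # SSIDs follow "P2P-Hotspot-[PublicKeyOfHotspotNode]"

def is_p2p_hotspot(ssid):
    """Check whether a scanned SSID belongs to the mesh app."""
    return ssid.startswith(SSID_PREFIX)

def match_clients(have_lists, want_lists):
    """Pair clients so that one offers a file the other wants.

    have_lists / want_lists: dict mapping client public key -> set of file ids.
    Returns (provider, consumer, file_id) tuples; each client is used at most
    once so the pair can leave and form a dedicated 1-to-1 link.
    """
    matches, used = [], set()
    for provider, offered in have_lists.items():
        for consumer, wanted in want_lists.items():
            if provider == consumer or provider in used or consumer in used:
                continue
            common = offered & wanted
            if common:
                matches.append((provider, consumer, next(iter(common))))
                used.update({provider, consumer})
                break
    return matches

# Example: node B wants a file that node A offers.
have = {"pubkeyA": {"report.pdf"}, "pubkeyB": {"photo.jpg"}}
want = {"pubkeyB": {"report.pdf"}, "pubkeyC": {"video.mp4"}}
print(match_clients(have, want))  # [('pubkeyA', 'pubkeyB', 'report.pdf')]
```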
This step is very dynamic and allows closely located nodes to connect and exchange data based on their signaling to the previous hotspot. The freshly connected nodes signal their offerings and interests again to confirm the match. The following high-speed 1-to-1 data transfer can reach up to 60 Mbit/s, which is much more than the 11 Mbit/s of the traditional IEEE 802.11 ad hoc mode. Once the nodes are done with the transfer, they release their link and listen again for the presence of coordinating hotspots. If no hotspot is available, they offer this role themselves, based on random timeouts, and create a hotspot. Hotspots only actively wait for clients for a short time and then try to join a hotspot themselves; the roles fluctuate constantly. Thus the network constantly provides connection options to nearby nodes through dynamic hotspotting: hotspots index the content of connected clients and coordinate ideally matching 1-to-1 pairs for high-speed transfers. Thus, requirements 2–6 are resolved. Finally, to implement requirements 7 and 8, the nodes also maintain information on the hotspots they have met and the nodes they have exchanged data with, thus creating a time-dependent local view of the network. In addition to signaling the data files they offer, the nodes also signal their connectivity history to the hotspots they meet. Hotspots, on the other side, share this information with newly joining clients, thus creating a virtual view of the topology of the network. Using this information, nodes can decide in which direction, i.e. over which nodes, to route a message, so multihop routing is supported. Step by step and in a delay-tolerant manner, data packets can be passed on until they reach the destination. This approach for opportunistic, multihop routing fulfills requirements 7 and 8. In close cooperation with PowerFolder, we implemented this approach and evaluated its feasibility and performance. PowerFolder is the leading data synchronization solution in the German market and allows data to be synchronized between desktop PCs. With our extension, it is also possible to synchronize data directly across smartphones and tablets, thus saving mobile data traffic from the data plan while supporting fast, best-in-class transmission speeds. We implemented our approach in an Android app and performed various tests on connection speed, transmission times and the reliability of the solution. Our measurements show that transmission speeds of up to 60 Mbit/s are reached in the closest proximity, and around 10 Mbit/s is still obtainable at 100 meters distance. The multihop functionality works reliably, with a decrease in transmission speed related to the distance and the number of hops. We also experimented with direct hop-wise and end-to-end transmission encryption and the resulting security gain; the encryption speed is reasonable for small files. Please note that using the public key infrastructure we established, it is possible to encrypt data both hop-by-hop and end-to-end. Our approach presents a novel networking approach for the future connected usage of smartphones and tablets in the information-based society. Addressing one of the grand challenges of Qatar, we are optimistic that the approach is also very suitable in Qatar to support society's shift towards better usability and more secure, higher-bandwidth data exchange.
Mixed Hybrid Finite Element Formulation for Subsurface Flow and Transport
Authors: Ahmad S. Abushaikha, Denis V. Voskov and Hamdi A. Tchelepi
We present a mixed hybrid finite element formulation for modelling subsurface flow and transport. The formulation is fully implicit in time and employs tetrahedral elements for the spatial discretization of the subsurface domain. It comprises all the main physics that dictate the flow behaviour for subsurface flow and transport, since it is developed on, and inherits them from, the Automatic Differentiation General Purpose Research Simulator (AD-GPRS) of the Stanford University Petroleum Research Institute (SUPRI-B).
Traditionally, the finite volume formulation has been the method employed for computational fluid dynamics and reservoir simulation, thanks to its local conservation of mass and energy and its straightforward implementation. However, it requires the use of structured grids and fails to handle high anisotropy in the material properties of the domain. Also, the method is of low computational order: the computed local solution in the grid is piecewise constant.
Here, we use the mixed hybrid finite element formulation, which is of high order and can handle high anisotropy in the material properties. It solves the momentum and mass balance equations simultaneously, hence the name mixed. This strongly coupled scheme facilitates the use of unstructured grids, which are important for modelling the complex geometry of subsurface reservoirs. The Automatic Differentiation library of AD-GPRS automatically differentiates the computational variables needed for the construction of the Jacobian matrix, which consists of the momentum and mass balance unknowns and any wells present.
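As a brief illustration of the mixed form (a standard single-phase Darcy sketch, not the full compositional system used in AD-GPRS), the momentum and mass balance equations solved simultaneously read

\[
\mathbf{u} = -\frac{\mathbf{K}}{\mu}\,\nabla p, \qquad
\phi\,\frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\,\mathbf{u}) = q,
\]

where \(\mathbf{u}\) is the Darcy velocity, \(\mathbf{K}\) the (possibly highly anisotropic) permeability tensor, \(\mu\) the viscosity, \(p\) the pressure, \(\phi\) the porosity, \(\rho\) the fluid density and \(q\) a source/sink (well) term. In the mixed hybrid method both \(\mathbf{u}\) and \(p\) are retained as unknowns, with interface pressures acting as Lagrange multipliers that enforce flux continuity across element faces.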
We use two types of tetrahedral elements, Raviart-Thomas (RT0) and Brezzi-Douglas-Marini (BDM1), of low and high order respectively. The RT0 element has one momentum equation per interface, while the BDM1 element has three momentum equations per interface, ensuring second-order flux approximation. Therefore, compared to the finite volume approach, where the Jacobian consists of the mass balance and well unknowns only, the mixed hybrid formulation has a larger Jacobian (by an order of magnitude for the high-order element), which is computationally expensive. Nonetheless, the formulation converges numerically and physically better than the finite volume approach, as we show.
The full system is solved implicitly in time to account for the non-linear behaviour of the flow and transport at the subsurface level which is highly pressure, volume, and temperature (PVT) dependent. Therefore, we make use of the already robust PVT formulations in AD-GPRS. We present a carbon dioxide (CO2) sequestration case for the Johnson formation and discuss the numerical and computational results.
This is of crucial importance for Qatar and the Middle East, where effective reservoir modelling and management requires a robust representation of the flow and transport at the subsurface level using state-of-the-art formulations.
In the literature, Wheeler et al. (2010) employ a multipoint flux mixed finite element approach to eliminate the momentum balance equation from the Jacobian and substitute it with the well-established multipoint flux approximation (MPFA) in the mass balance equation. Since it is based on MPFA, it still suffers from convergence issues where high anisotropy is present in the material properties. They have recently extended their work to compositional modelling of fluids, Singh & Wheeler (2014); however, they solve the system sequentially in time, whereas our method solves the system fully implicitly in time. Sun & Firoozabadi (2009) solve the pressure implicitly and the fluid properties explicitly in time by further decoupling the mass balance equations, which decreases the physical representation of the non-linear behaviour of the flow and transport at the subsurface level.
Develop a Global Scalable Remote Laboratory Based on a Unified Framework
Authors: Hamid Parsaei, Ning Wang, Qianlong Lan, Xuemin Chen and Gangbing Song
Information technology has had a great impact on education and research by enabling additional teaching and research strategies. According to the 2014 Sloan Survey of Online Learning, the number of students who have taken at least one online course increased to a new total of 7.1 million during the Fall 2013 semester. Remote laboratory technology has made great progress in the arena of online learning. Internet remote-controlled experiments were previously implemented based on the unified framework at UH and TSU. However, end users of that framework were required to install the LabVIEW plug-in in their web browsers to use remote experiments online, and the framework only supported desktops and laptops. In order to resolve the plug-in issues, a novel unified framework is proposed, based on Web 2.0 and HTML5 technology. As shown in Fig. 1, there are three application layers in the unified framework: the client application layer, the server application layer and the experiment control layer. The client web application is based on HyperText Markup Language (HTML), Cascading Style Sheets (CSS) and the JQuery/JQuery-Mobile JavaScript libraries, with Mashup technology used for the user interface implementation. The client web application can run in most current popular browsers such as IE, Firefox, Chrome, Safari, etc. The server application is based on Web Service technology and is built directly on top of the MySQL database, the Apache web server engine and the Node.js web server engine. It utilizes JSON and Socket.IO, which is built on the WebSocket protocol, to implement real-time communication between the server application and the client web application (Rai, R.,). The server application runs on a LANMP (Linux/Apache/Node.js/MySQL/PHP) server. The experiment control application is based on LabVIEW and uses Socket.IO for real-time communication with the server application. The remote laboratory based on the novel unified framework is able to run on many different devices, such as desktop and laptop PCs, iPads, Android tablets, smartphones, etc., without software plug-ins. However, some challenges remain for remote laboratory development: 1) How to access remote experiments installed at different laboratories through a single webpage? 2) How to manage the remote experiments at the different laboratories? 3) How to resolve system safety issues? In order to resolve these challenges, a new scalable global remote laboratory was implemented at Texas A&M University at Qatar (TAMUQ) based on the improved unified framework. To integrate the three different remote laboratories at TAMUQ, UH and TSU, the new global scalable remote laboratory architecture was designed and developed at TAMUQ. The labs operate with a unified scheduler, a federated authentication module and a user management system. Meanwhile, a scalable server was also set up at TAMUQ to support expansion of the remote laboratory. Figure 2 shows the global scalable remote laboratory architecture. In this scalable remote laboratory, the laboratory center server at TAMUQ consists of a scalable server connected to the other two lab center servers at UH and TSU. All three laboratory center servers are based on the Linux/Node.js/Apache/MySQL/PHP (LNAMP) architecture.
Socket.IO, a real-time communication technology, was used to manage the transmission of experimental data and other user information (such as user profiles, login information, etc.) in this global platform. The center server at TAMUQ was designated as the center proxy server for the scalable remote laboratory. With this global platform, end users can access all remote experiments of these three universities via one website. With the new global scalable remote laboratory based on the novel unified framework, the scalable scheduler and federated authentication solution were designed and implemented. At the same time, issues with security control and management of experiment access were addressed by taking full advantage of the functionalities offered by the MD5-based security management engine. As shown in Fig. 3, a new user interface was also developed and integrated into the new scalable remote laboratory. With the new global scalable remote laboratory, future teaching and learning activities at TAMUQ, UH and TSU will be improved. Meanwhile, the improved unified framework will significantly benefit remote laboratory development in the future as well.
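As a rough illustration of the real-time data path, the sketch below uses the python-socketio client to emit experiment readings to a laboratory center server. It is purely illustrative: the actual client layer is HTML/JQuery and the control layer is LabVIEW, and the server URL and event names here are hypothetical.

```python
# Illustrative only: a Socket.IO client pushing experiment data to a
# center server, mimicking the real-time channel between the experiment
# control layer and the server application. Event names and the URL are
# hypothetical, not part of the TAMUQ/UH/TSU deployment.
import socketio

sio = socketio.Client()

@sio.event
def connect():
    print("connected to laboratory center server")

@sio.on("control_command")
def on_control_command(data):
    # Commands forwarded from the client web application.
    print("received command:", data)

def publish_reading(sensor, value):
    # Experiment data streamed to the server, which relays it to browsers.
    sio.emit("experiment_data", {"sensor": sensor, "value": value})

if __name__ == "__main__":
    sio.connect("http://lab-center.example.edu:3000")  # hypothetical URL
    publish_reading("temperature", 23.7)
    sio.wait()
```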
The Assessment of Pedestrian-Vehicle Conflicts at Crosswalks Considering Sudden Pedestrian Speed Change Events
Authors: Wael Khaleel Alhajyaseen and Miho Iryo
Introduction
Pedestrians are vulnerable road users. In Japan, more than one-third of the fatalities in traffic crashes are pedestrians, and most accidents occur as pedestrians cross a road. To evaluate alternative countermeasures effectively, traffic simulation has recently been considered one of the most powerful decision support tools (Shahdah et al. 2015). A very important requirement for reliable use of traffic simulation in safety assessment is the proper representation of road user behavior at potential conflict areas. Severe conflicts usually occur when road users fail to predict other users' decisions and to react to them properly. The widely varying behaviors and maneuvers of vehicles and pedestrians may lead to misunderstanding of these decisions, which can result in severe conflicts. So far, most existing studies assume constant walking speeds for pedestrians and complete obedience to traffic rules when crossing roads, as if they were on walkways. However, it is known that pedestrians behave differently at crosswalks compared to other walking facilities such as sidewalks and walkways. Pedestrians tend to walk faster at crosswalks (Montufar et al. 2007). Furthermore, their compliance with traffic signals varies by traffic conditions and other factors (Wang et al. 2011). Although many studies have analyzed pedestrian behavior, including speed, at crosswalks, most of them are based on the average crossing speed without considering the speed profile of the crossing process and the variations within it. Iryo-Asano et al. (2015) observed from empirical data that pedestrians may suddenly and significantly change their speed on crosswalks as a reaction to surrounding conditions. Such speed changes cannot be predicted by drivers, which can lead to safety hazards. A study of these speed change maneuvers is critical for representing potential collisions in simulation systems and for evaluating the probability and severity of collisions reasonably. The objective of this study is to quantitatively model pedestrian speed change maneuvers and integrate the model into traffic simulation for assessing traffic safety.
Pedestrian speed change events as a critical maneuver
Figure 1 shows an observed pedestrian trajectory with a sudden speed change. If a turning vehicle is approaching the conflict area, its driver may behave based on his expectation of the pedestrian's arrival time at the conflict area. If the pedestrian suddenly changes his/her speed close to the conflict area, the driver will not be able to predict the new arrival time, which might lead to severe conflicts. In the observed example in Figure 1, the pedestrian suddenly increased his speed at the beginning of the conflict area, which led to an arrival at the conflict area 2.0 seconds (Tdif) earlier than the time expected had the pedestrian continued at his/her previous speed. A turning vehicle present at the same time cannot predict this early arrival, and these 2 seconds are large in terms of collision avoidance. Iryo-Asano et al. (2015) showed that pedestrian speed changes mainly occur 1) at the entrance to the pedestrian-vehicle conflict area and 2) when there is a large gap between the pedestrian's current speed and the speed necessary to complete crossing before the end of the pedestrian flashing green interval. In this study, further in-depth analysis is conducted by combining the pedestrian data with the trajectories of approaching vehicles to identify the factors influencing pedestrians' sudden speed change events. The probability of a speed change is quantitatively modeled as a function of the remaining green time, the remaining length to cross, the current walking speed and other related variables.
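A minimal sketch of how such a probability model can be expressed is given below, assuming a logistic form driven by the gap between the required and current walking speeds; the coefficients are hypothetical placeholders, not the estimated values reported in the study.

```python
# Sketch of a pedestrian speed-change probability model: a logistic
# function of remaining green time, remaining crossing length and current
# walking speed. Coefficients below are placeholders, not estimated values.
import math

def required_speed(remaining_length_m, remaining_green_s):
    """Speed needed to finish crossing before the flashing green ends."""
    return remaining_length_m / max(remaining_green_s, 1e-6)

def speed_change_probability(remaining_green_s, remaining_length_m,
                             current_speed_ms,
                             b0=-2.0, b_gap=1.5):  # hypothetical coefficients
    """Probability that the pedestrian suddenly changes speed.

    The key explanatory variable is the gap between the speed required to
    complete the crossing in time and the pedestrian's current speed.
    """
    gap = required_speed(remaining_length_m, remaining_green_s) - current_speed_ms
    z = b0 + b_gap * gap
    return 1.0 / (1.0 + math.exp(-z))

# Example: 12 m left to cross, 6 s of green remaining, walking at 1.2 m/s.
print(round(speed_change_probability(6.0, 12.0, 1.2), 3))
```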
Simulation integration for safety assessment
The proposed pedestrian maneuver model is implemented into an integrated simulation model by combining it with a comprehensive turning vehicle maneuver model (Dang et al. 2012). The vehicle maneuver model is dedicated to representing the probabilistic nature of drivers' reactions to road geometry and surrounding road users in order to evaluate the impact of user behavior on traffic safety. It produces speed profiles of turning vehicles considering the impact of geometry (i.e., intersection angles and setback distance of the crosswalks) and the gap between the expected arrival time of the vehicle and that of the pedestrians at the conflict area. The proposed model allows us to study the dependencies and interactions between pedestrians and turning vehicles at crosswalks. Using the integrated traffic simulation, pedestrian-vehicle conflicts are generated and surrogate safety measures, such as Post Encroachment Time and vehicle speeds at conflict points, are estimated. These measures are used to evaluate the probability and severity of pedestrian-vehicle conflicts. To verify the characteristics of the simulated conflicts, estimated and observed surrogate safety measures at a selected signalized crosswalk are compared through statistical tests.
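For reference, Post Encroachment Time (PET) between a crossing pedestrian and a turning vehicle can be computed as in the short sketch below; this follows the generic definition of PET, with variable names chosen by us.

```python
# Post Encroachment Time (PET): the time between the first road user leaving
# the conflict area and the second road user entering it. Smaller PET
# values indicate more severe conflicts.

def post_encroachment_time(first_user_exit_s, second_user_entry_s):
    """PET in seconds; assumes the first user clears the conflict area
    before the second user reaches it."""
    return second_user_entry_s - first_user_exit_s

# Example: the pedestrian leaves the conflict area at t = 14.2 s and the
# turning vehicle enters it at t = 15.1 s -> PET of 0.9 s (a severe conflict).
print(post_encroachment_time(14.2, 15.1))
```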
Conclusions
The consideration of sudden speed change behavior of pedestrians in the simulation environment generates more reliable and realistic pedestrian maneuvers and turning vehicle trajectories, which enables more accurate assessment of pedestrian-vehicle conflicts. This in turn enables the assessment of improvements in signal control settings and in the geometric layout of crosswalks towards safer and more efficient operations. Furthermore, the model is useful for real-time detection of hazardous conflict events, which can be applied in vehicle safety assistance systems.
This research is supported by JSPS KAKENHI Grant No. 15H05534. The authors are grateful to Prof. Hideki Nakamura and Ms. Xin Zhang for providing video survey data.
References
Dang M.T., et al. (2012). Development of a Microscopic Traffic Simulation Model for Safety Assessment at Signalized Intersections, Transportation Research Record, 2316, pp. 122–131.
Iryo-Asano, M., Alhajyaseen, W., Zhang, X. and Nakamura, H. (2015) Analysis of Pedestrian Speed Change Behavior at Signalized Crosswalks, 2015 Road Safety & Simulation International Conference, October 6th–8th, Orlando, USA.
Montufar, J., Arango, J., Porter, M., and Nakagawa, S. (2007), The Normal Walking Speed of Pedestrians and How Fast They Walk When Crossing The Street, Proceedings of the 86th Annual Meeting of the Transportation Research Board, Washington D. C., USA.
Shahdah U., et al. (2015), Application of traffic microsimulation for evaluating safety performance of urban signalized intersections, Transportation Research Part C, 60, pp. 96–104.
Wang, W., Guo, H., Gao, Z., and Bubb, H. (2011) Individual Differences of Pedestrian Behaviour in Midblock Crosswalk and Intersection, International Journal of Crashworthiness, Vol. 16, No. 1, pp. 1–9.
Bi-Text Alignment of Movie Subtitles for English-Arabic Statistical Machine Translation
Authors: Fahad Ahmed Al-Obaidli and Stephen Cox
With the increasing demand for access to content in foreign languages in recent years, we have also seen a steady improvement in the quality of tools that can help bridge this gap. One such tool is Statistical Machine Translation (SMT), which learns automatically from real examples of human translations, without the need for manual intervention. Training such a system takes just a few days, sometimes even hours, but requires a lot of sentences aligned to their corresponding translations, a resource known as a bi-text.
Such bi-texts contain translations of written texts as they are typically derived from newswire, administrative, technical and legislation documents, e.g., from the EU and UN. However, with the widespread use of mobile phones and online conversation programs such as Skype as well as personal assistants such as Siri, there is a growing need for spoken language recognition, understanding, and translation. Unfortunately, most bi-texts are not very useful for training a spoken language SMT system as the language they cover is written, which differs from speech in style, formality, vocabulary choice, length of utterances, etc.
It turns out that there exists a growing community-generated source of spoken language translations, namely movie subtitles. These come in plain text in a common format that facilitates rendering the text segments appropriately. The dark side of subtitles is that they are usually created for pirated copies of copyright-protected movies. Yet, their use in research exploits a “positive side effect” of Internet movie piracy, which allows for easy creation of spoken bi-texts in a number of languages. This alignment typically relies on a key property of movie subtitles, namely the temporal indexing of subtitle segments, along with other features.
Due to the nature of movies, subtitles differ from other resources in several aspects: they are mostly transcriptions of movie dialogues that are often spontaneous speech, which contains a lot of slang, idiomatic expressions, and also fragmented spoken utterances, with repetitions, errors and corrections, rather than grammatical sentences; thus, this material is commonly summarised in the subtitles, rather than being literally transcribed. Since subtitles are user-generated, the translations are free, incomplete and dense (due to summarization and compression) and, therefore, reveal cultural differences. Degrees of rephrasing and compression vary across languages and also depend on subtitling traditions. Moreover, subtitles are created to be displayed in parallel to a movie in order to be linked to the movie's actual sound signal. Subtitles also arbitrarily include some meta information such as the movie title, year of release, genre, subtitle author/translator details and trailers. They may also contain visual translation, e.g., into a sign language. Certain versions of subtitles are especially compiled for the hearing-impaired to include extra information about non-spoken sounds that are either primary, e.g., coughing, or secondary background noises, e.g., soundtrack music, street noise, etc. This brings yet another challenge to the alignment process: the complex mappings caused by many deletions and insertions. Furthermore, subtitles must be short enough to fit the screen in a readable manner and are only shown for a short time period, which presents a new constraint to the alignment of different languages with different visual and linguistic features.
The languages a subtitle file is available in differ from one movie to another. Notably, the Arabic language, even though it is spoken by more than 420 million people worldwide and is the 5th most spoken language worldwide, has a relatively scarce online presence. For example, according to Wikipedia's statistics of article counts, Arabic is ranked 23rd. Yet, Web traffic analytics shows that search queries for Arabic subtitles and traffic from the Arabic region are among the highest. This increase in demand for Arabic content is not surprising given the recent dramatic economic and socio-political shifts in the Arab World. On another note, Arabic, as a Semitic language, has a complex morphology, which requires special handling when mapping it to another language and therefore poses a challenge for machine translation.
In this work, we look at movie subtitles as a unique source of bi-texts in an attempt to align as many translations of movies as possible in order to improve English-to-Arabic SMT. Translating from English into Arabic is an underexplored translation direction and, due to the morphological richness of Arabic along with other factors, yields significantly lower results compared to translating in the opposite direction (Arabic to English).
For our experiments, we collected pairs of English-Arabic subtitles for more than 29,000 movies/TV shows, a collection that is bigger than any preexisting subtitle data set. We designed a sequence of heuristics to eliminate the inherent noise in the subtitles' source in order to yield good-quality alignment. We aligned the subtitles using time overlap, utilising the timing information provided within the subtitle files. This alignment approach is language-independent and outperforms other traditional approaches such as the length-based approach, which relies on segment boundaries to match translation segments; segment boundaries differ from one language to another, e.g., because of the need to fit the text on the screen.
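A minimal sketch of the time-overlap idea is shown below, assuming subtitle segments are given as (start, end, text) triples in seconds; this is a simplified illustration of the alignment criterion, not the full pipeline, and the 0.5 threshold is an illustrative choice.

```python
# Sketch of time-overlap subtitle alignment: two segments from different
# language files are paired if their display intervals overlap enough.
# Segments are (start_s, end_s, text) triples.

def overlap_ratio(seg_a, seg_b):
    """Length of the temporal intersection divided by the shorter segment."""
    start = max(seg_a[0], seg_b[0])
    end = min(seg_a[1], seg_b[1])
    shorter = min(seg_a[1] - seg_a[0], seg_b[1] - seg_b[0])
    return max(0.0, end - start) / shorter if shorter > 0 else 0.0

def align(english_segs, arabic_segs, threshold=0.5):
    """Greedy pairing of segments whose time overlap exceeds the threshold."""
    pairs = []
    for en in english_segs:
        best = max(arabic_segs, key=lambda ar: overlap_ratio(en, ar), default=None)
        if best is not None and overlap_ratio(en, best) >= threshold:
            pairs.append((en[2], best[2]))
    return pairs

en = [(1.0, 3.5, "Where are you going?"), (4.0, 6.0, "Home.")]
ar = [(1.1, 3.4, "إلى أين أنت ذاهب؟"), (4.2, 6.1, "إلى المنزل.")]
print(align(en, ar))  # both segments pair up by temporal overlap
```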
Our goal was to maximise the number of aligned sentence pairs while minimising the alignment errors. We evaluated our models relatively and also extrinsically, i.e., by measuring the quality of an SMT system that used this bi-text for training. We automatically evaluated our SMT systems using BLEU, a standard measure for machine translation evaluation. We also implemented an in-house Web application tool in order to crowd-source human judgments comparing the SMT baseline's output and our best-performing system's output.
Our experiments yielded bi-texts of varied size and relative quality, which we used to train an SMT system. Adding any of our bi-texts improved the baseline SMT system, which was trained on TED talks from the IWSLT 2013 competition. Ultimately, our best SMT system outperformed the baseline by about two BLEU points, which is a very significant improvement, clearly visible to humans; this was confirmed in manual evaluation. We hope that the resulting subtitles corpus, the largest collected so far (about 82 million words), will facilitate research in spoken language SMT.
A Centralized System Approach to Indoor Navigation for the Visually Impaired
Authors: Alauddin Yousif Al-Omary, Hussain M. Al-Rizzo and Haider M. AlSabagh
People who are Blind or Visually Impaired (BVI) have one goal in common: navigating through unfamiliar indoor environments without the intervention of a human guide. The number of blind people in the world is not accurately known at present; however, based on the 2010 Global Data on Visual Impairments of the World Health Organization, approximately 285 million people are estimated to be visually impaired worldwide: 39 million are blind and 246 million have low vision, 90% of them in developing countries, with 82% of blind people aged 50 and above. Available extrapolated statistics about blindness in some countries in the Middle East show ∼102,618 in Iraq, ∼5,358 in the Gaza Strip, ∼22,692 in Jordan, ∼9,129 in Kuwait, ∼104,321 in Saudi Arabia, and ∼10,207 in the United Arab Emirates. These statistics reveal the importance of developing a useful, accurate, and easy-to-use navigation system to help this large population of disabled people in their everyday lives. Various commercial products are available to navigate BVI people in outdoor environments based on the Global Positioning System (GPS), where the receiver must have a clear view of the sky. Indoor geo-location, on the other hand, is much more challenging because objects surrounding the user can block or interfere with the GPS signal.
In this paper, we present a centralized wireless indoor navigation system to aid the BVI. The system is designed not only to accurately locate, track, and navigate the user, but also to find the safest travel path and easily communicate with the BVI. A centralized approach is adopted because of the lack of research in this area. Some proposed navigation systems require users to inconveniently carry heavy navigation devices; some require administrators to install a complex network of sensors throughout a building; and others are simply impractical. The system consists of four major components: 1) a Wireless Positioning Subsystem, 2) a Visual Indoor Modeling Interface, 3) a Guidance and Navigation Subsystem, and 4) a Path-Finding Subsystem.
A significant part of the navigation system is the virtual modeling of the building and the design of the path-finding algorithms, which will be the main focus of this research. Ultimately, the proposed system provides the design and building blocks for a fully functional package that can be used to build a complete centralized indoor navigation system, from creating the virtual models for buildings to tracking and interacting with BVI users over the network.
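As an illustration of what a path-finding subsystem of this kind might compute, the sketch below finds a "safest" route over a building graph with Dijkstra's algorithm, where each edge cost combines walking distance with a hazard penalty; the graph, weights and weighting scheme are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of a path-finding subsystem: Dijkstra's algorithm
# over a building graph whose edge cost is distance plus a hazard penalty
# (e.g., stairs or crowded corridors). The graph and weights are made up.
import heapq

def safest_path(graph, start, goal):
    """graph: dict node -> list of (neighbor, distance_m, hazard_penalty)."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nbr, dist, hazard in graph.get(node, []):
            if nbr not in visited:
                heapq.heappush(queue, (cost + dist + hazard, nbr, path + [nbr]))
    return float("inf"), []

building = {
    "entrance": [("corridor_A", 10, 0), ("stairs", 4, 8)],
    "corridor_A": [("elevator", 6, 1)],
    "stairs": [("room_205", 12, 5)],
    "elevator": [("room_205", 9, 0)],
}
print(safest_path(building, "entrance", "room_205"))  # prefers the elevator route
```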
Evaluation of Big Data Privacy and Accuracy Issues
Authors: Reem Bashir and Abdelhamid Abdelhadi Mansor
Nowadays, massive amounts of data are stored, and the data itself typically contains a lot of non-trivial but useful information. Data mining techniques can be used to discover this information, which can help companies with decision-making. However, in real-life applications data is massive and stored over distributed sites, and one of our major research topics is protecting privacy over this kind of data. Previously, the important characteristics, issues and challenges related to the management of large amounts of data have been explored. Various open-source data analytics frameworks that deal with large data analytics workloads have been discussed, and a comparative study of these frameworks and their suitability has been proposed. The digital universe is flooded with huge amounts of data generated by users worldwide. These data are diverse in nature and come from various sources in many forms. To keep up with the desire to store and analyze ever larger volumes of complex data, relational database vendors have delivered specialized analytical platforms that come in many shapes and sizes, from software-only products to analytical services that run in third-party hosted environments. In addition, new technologies have emerged to address exploding volumes of complex data, including web traffic, social media content and machine-generated data such as sensor data and global positioning system data.
Big data is defined as data of such volume that it requires new technologies and architectures to extract value from it through capture and analysis. Due to this size, it becomes very difficult to perform effective analysis using existing traditional techniques. Big data has become a prominent research field, especially when it comes to decision making and data analysis. However, big data, due to its various properties such as volume, velocity, variety, variability, value and complexity, puts forward many challenges. Since big data is a recent, emerging technology in the market that can bring huge benefits to business organizations, it becomes necessary that the various challenges and issues associated with bringing in and adapting to this technology are brought to light. Another challenge is that data collection may not be accurate enough, which leads to inconsistent analysis that can critically affect the decisions based on it. Moreover, it is clearly apparent that organizations need to employ data-driven decision making to gain competitive advantage. Processing, integrating and interacting with more data should make it better data, providing both more panoramic and more granular views to aid strategic decision making. This is made possible by big data exploiting affordable and usable computational and storage resources. Many offerings are based on the MapReduce and Hadoop paradigms, and most focus solely on the analytical side. Nonetheless, in many respects it remains unclear what big data actually is; current offerings appear as isolated silos that are difficult to integrate and/or make it difficult to better utilize existing data and systems. Data is growing at such a speed that it is difficult to handle such large amounts (exabytes); the main difficulty is that the volume is increasing rapidly in comparison to the available computing resources. The term big data as used nowadays is something of a misnomer, as it points out only the size of the data without paying much attention to its other properties.
If data is to be used to make accurate decisions in time, it must be available in an accurate, complete and timely manner. This makes the data management and governance process somewhat complex, adding the necessity of making data open and available to government agencies in a standardized manner, with standardized APIs, metadata and formats, thus leading to better decision making, business intelligence and productivity improvements.
This paper presents a discussion and evaluation of the most prominent techniques used in the processes of data collection and analysis in order to identify the privacy defects in them that affect the accuracy of big data. Based on the results of this analysis, recommendations are provided for improving data collection and analysis techniques that will help to avoid most, if not all, of the problems facing the use of big data in decision making.
Keywords: Big Data, Big Data Challenges, Big Data Accuracy, Big Data Collection, Big Data Analytics.
Video Demo of LiveAR: Real-Time Human Action Recognition over Live Video Streams
By Yin Yang
We propose to present a video demonstration of LiveAR at the ARC'16 conference. For this purpose, we have prepared three demo videos, which can be found in the submission files. These video demos show the effectiveness and efficiency of LiveAR running on video streams containing a diverse set of human actions. Additionally, the demo also exhibits important system performance parameters such as latency and resource usage.
LiveAR is a novel system for recognizing human actions, such as running and fighting, in a video stream in real time, backed by a massively-parallel processing (MPP) platform. Although action recognition is a well-studied topic in computer vision, so far most attention has been devoted to improving accuracy, rather than efficiency. To our knowledge, LiveAR is the first that achieves real-time efficiency in action recognition, which can be a key enabler in many important applications, e.g., video surveillance and monitoring over critical infrastructure such as water reservoirs. LiveAR is based on a state-of-the-art method for offline action recognition which obtains high accuracy; its main innovation is to adapt this base solution to run on an elastic MPP platform to achieve real-time speed at an affordable cost.
The main objectives in the design of LiveAR are to (i) minimize redundant computations, (ii) reduce communication costs between nodes in the cloud, (iii) allow a high degree of parallelism and (iv) enable dynamic node additions and removals to match the current workload. LiveAR is based on an enhanced version of Apache Storm. Each video manipulation operation is implemented as a bolt (i.e., logical operator) executed by multiple nodes, while the input frames arrive at the system via a spout (i.e., streaming source). The output of the system is presented on screen using FFmpeg.
Next we briefly explain the main operations in LiveAR. The dense point extraction bolt is a first step for video processing, which has two input streams: the input video frame and the current trajectories. The output of this operator consists of dense points sampled in the video frame that are not already on any of the current trajectories. In particular, LiveAR partitions the frame into different regions, and assigns one region to a dense point evaluator, each running in a separate thread. Then, the sampled coordinates are grouped according to the partitioning, and routed to the corresponding dense point evaluator. Meanwhile, coordinates on current trajectories are similarly grouped by a point dispatcher, and routed accordingly. Such partitioning and routing minimizes network transmissions as each node is only fed the pixels and trajectory points it needs.
The optic flow generation operator is executed by multiple nodes in parallel, similarly to the dense point extractor. An additional challenge here is that the generation of optic flows involves (i) comparing two frames at consecutive time instances and (ii) using multiple pixels to determine the value of the flow at each coordinate. Point (i) means that the operator is stateful, i.e., each node must store the previous frame and compare it with the current one. Hence, node additions and removals (necessary for elasticity) become non-trivial, as a new node does not immediately possess the necessary state (i.e., pixels of the previous frame) to work on its inputs. Regarding (ii), each node cannot simply handle a region of the frame, as is the case in the dense point extractor, because the computation at one coordinate relies on the surrounding pixels. Our solution in LiveAR is to split the frame into overlapping patches; each patch contains a partition of the frame as well as the pixels surrounding that partition. This design effectively reduces the amount of network transmission, thus improving system scalability.
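The overlapping-patch idea can be sketched as follows; this is a simplified NumPy illustration with hypothetical patch and halo sizes, whereas the actual partitioning in LiveAR happens inside Storm bolts.

```python
# Sketch of splitting a frame into overlapping patches: each patch holds
# its own region plus a halo of surrounding pixels, so optic flow at the
# region border can be computed without asking other nodes for pixels.
# Patch and halo sizes are illustrative.
import numpy as np

def overlapping_patches(frame, patch_h, patch_w, halo):
    """Yield (row, col, patch) where the patch includes a halo of `halo` pixels."""
    H, W = frame.shape[:2]
    for r in range(0, H, patch_h):
        for c in range(0, W, patch_w):
            r0, r1 = max(0, r - halo), min(H, r + patch_h + halo)
            c0, c1 = max(0, c - halo), min(W, c + patch_w + halo)
            yield r, c, frame[r0:r1, c0:c1]

frame = np.zeros((480, 640), dtype=np.uint8)  # dummy grayscale frame
patches = list(overlapping_patches(frame, 240, 320, halo=16))
print(len(patches), patches[0][2].shape)  # 4 patches, the first is 256x336
```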
Lastly, the trajectory tracking operator involves three inputs: the current trajectories, the dense points detected from the input frame, and the optic flows of the input frame. The main idea of this operator is to “grow” a trajectory, either an existing one or a new one starting at a dense point, by adding one more coordinate computed from the optic flow. Note that it is possible that the optic flow indicates that there is no more coordinate on this trajectory in the input frame, ending the trajectory. The parallelization of this operator is similar to that of the dense point extractor, except that each node is assigned trajectories rather than pixels and coordinates. Grouping of the trajectories is performed according to their last coordinates (or the newly identified dense points for new trajectories).
FPGA Based Image Processing Algorithms (Digital Image Enhancement Techniques) Using Xilinx System Generator
FPGAs have many significant features that make them a suitable platform for processing real-time algorithms. They give substantially higher performance than programmable Digital Signal Processors (DSPs) and microprocessors. At present, the use of FPGAs in research and development of applied digital systems for specific tasks is increasing. This is due to the advantages FPGAs have over other programmable devices: high clock frequency, high operations per second, code portability, code library reusability, low cost, parallel processing, the capability of interacting with high- or low-level interfaces, security and Intellectual Property (IP).
This paper presents the concept of hardware digital image processing algorithms using a field programmable gate array (FPGA). It focuses on implementing an efficient architecture for image processing algorithms such as image enhancement (point processing techniques) using the fewest possible System Generator blocks. In this paper, the modern approach of Xilinx System Generator (XSG) is used for system modeling and FPGA programming. Xilinx System Generator is a MATLAB-based tool that generates the bitstream file (*.bit) and netlist along with timing and power analysis. The performance of these architectures is evaluated on the XUPV5-LX110T FPGA board.
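To make the "point processing" class of enhancements concrete, the sketch below shows two typical operations (image negative and linear contrast stretching) in NumPy. This is a software illustration of the algorithms only, not the Xilinx System Generator block model, and the parameter choices are ours.

```python
# Point-processing image enhancement: each output pixel depends only on the
# corresponding input pixel. Shown here in NumPy purely to illustrate the
# algorithms that an XSG block design would implement in hardware.
import numpy as np

def negative(img):
    """Image negative for 8-bit grayscale input."""
    return 255 - img

def contrast_stretch(img):
    """Linear stretch of the intensity range to the full 0-255 scale."""
    lo, hi = int(img.min()), int(img.max())
    if hi == lo:
        return np.zeros_like(img)
    return ((img.astype(np.float32) - lo) * 255.0 / (hi - lo)).astype(np.uint8)

# Example on a small synthetic low-contrast image.
img = np.array([[100, 110], [120, 130]], dtype=np.uint8)
print(negative(img))
print(contrast_stretch(img))  # values spread across 0..255
```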
Alice-based Computing Curriculum for Middle Schools
Authors: Saquib Razak, Huda Gedawy, Don Slater and Wanda Dann
Alice is visualization software for introducing computational thinking and programming concepts in the context of creating 3D animations. Our research aims to introduce computational thinking and problem solving skills in middle schools in Qatar. To make this aim attainable, we have adapted the Alice software for a conservative Middle Eastern culture, developed curricular materials, and provided professional development workshops for teachers and students in the Middle East. There is a trend for countries to evaluate curricula from other cultures and then try to bring the successful ones into their own school systems. This trend is a result of societies beginning to realize the importance of education and knowledge; Qatar's efforts towards building a knowledge-based society and upgrading its higher education infrastructure are proof of this realization. The challenge is to recognize that although a strong curriculum is necessary, simply porting a successful curriculum to a different environment is not sufficient to guarantee success. Here we share our attempt to take a tool, with its associated curriculum, that has been very successful in several countries in the West and apply it in an environment with very different cultures and social values.
The Alice ME project is targeted at middle school (grades 6–8) teachers and students in the Middle East. The overall goal of the project is to adapt the Alice 2 software to the local cultures, develop new instructional materials appropriate for local systems, and test the effectiveness of the Alice approach at the middle school level. The “Alice approach” – using program visualization to teach/learn analytic, logical, and computational thinking, problem solving skills and fundamental programming concepts in the context of animation – remains the same.
In the formative phase of this project, our goal was to understand the environment and local culture and to evaluate the opportunities and challenges. The lessons learned in this phase are being used to formulate the future direction of our research. Although the Middle Eastern countries are rapidly modernizing, the desire to maintain traditional norms is strong. For this reason, we compiled two lists of models. One list was of existing models in the Alice gallery that are not appropriate for Middle Eastern culture; Qatar (and the Middle East in general) is a religious society that follows conservative norms in dress for both men and women. The second was a list of models that would be interesting and relevant to Qatari society. These two lists helped us determine which models might be modified, removed, or added to the gallery. We found that Qatar is a cultural society with a lot of emphasis on local customs. Local hospitality, religion, and traditional professions like pearl diving, fishing, and police work have a special place in society. We also discovered that people in general, and boys in particular, have a special respect for camels and the desert.
We created the curriculum in collaboration with one private school and the Supreme Education Council. Creating artifacts in isolation and expecting the educational systems to adopt them is not a prudent approach. Due to this collaboration, we learned that a majority of existing ICT and computing curriculum is based on step-by-step instructions that students are expected to follow and reproduce. There is a lack of stress on student learning, creativity, and application.
Most ICT teachers in Qatar, both in public and private schools, are trained ICT professionals. At the same time, most of these teachers are not familiar with program visualization tools such as Alice and have not taught fundamental programming concepts in their classes. As a result, the need for professional development workshops is urgent. We have conducted several workshops for teachers to help them use Alice during the pilot study of the curriculum. During these workshops, we focus on two main concepts – learning to use Alice as a tool, and learning to teach computational thinking using Alice.
We have piloted the curriculum, instructional materials, and the new 3-D models in the Alice gallery for middle school students in one private English school and two public Arabic schools. The pilot test involved more than 400 students in the three schools combined. During the pilot testing, we conducted a survey to obtain initial feedback regarding the 3D models from the Middle East gallery (students have access to all models that are part of Alice's core gallery). Through these surveys, we learned that those objects that students use in everyday life were more popular when it came to using models in Alice.
As part of the curriculum to teach computational thinking using Alice as a tool, we have created several artifacts which are made available to the schools. These items include:
Academic plan for the year
Learning outcomes for each lecture, PowerPoint presentations, class exercises, and assessment questions
Student textbook – One English book for grade 8, one Arabic textbook for grade 8 and one Arabic textbook for grade 11.
One of the most important skills essential for building the future generation is critical thinking. Although we are currently only looking at the acceptability of the newly created 3-D models and the usability of our curriculum and instructional material, we are still curious about the effectiveness of this curriculum in teaching computational thinking. We analyzed the results of an exam conducted at a local school and observed that students in grade 7 who followed the Alice-based curriculum performed better than those in grade 9 on the same exam. This exam was designed to measure critical thinking skills in problem solving without any reference to Alice. We hope that this result is directly related to the students' experience with Alice, as it makes the students think about a problem from different perspectives. We acknowledge that more formal work still needs to be done in order to support our hypothesis.
This academic year (2015–2016), Alice based curriculum is being used in Arabic in six independent schools and in English in four private English schools. There are more than 1400 students currently studying this content.
Towards a K-12 Game-based Educational Platform with Automatic Student Monitoring: “INTELLIFUN”
Authors: Aiman Erbad, Sarah Malaeb and Jihad Ja'am
Since the beginning of the twenty-first century, digital technologies have increasingly supported teaching and learning activities. Because learning is effective when it starts early, advanced early-years educational tools are highly recommended to help new generations gain the necessary skills to successfully build opportunities and progress in life. Despite all the advances in digital learning, there are still many problems that teachers, students and parents face. Students' learning motivation and problem-solving ability remain weak, while working memory capacity is found to be low for children under 11 years old, which causes learning difficulties such as developmental coordination disorder, difficulties in mathematics calculation, and language impairments. The latest PISA (Programme for International Student Assessment) results show that Qatar has seen the lowest scores in mathematics, science and reading performance compared to other countries with similar conditions and is ranked 63rd of the 65 countries involved, even though Qatari GDP and general government expenditure are high (OECD 2012).
Another problem affecting the educational experience of young children is family engagement. Parents need to be more involved in the learning process and to have quick, timely and detailed feedback about their children's progress in different topics of study. In fact, school days are limited, and parents can play an important role in improving their children's progress in learning and understanding concepts. Traditional assessment tools provide global grading, usually by topic of study (e.g., Algebra). Parents need a grading system by learning skill (e.g., addition facts to 10, solving missing-number problems, subtraction of money) to have a clear view of the specific skills that their children need to improve. Finally, teachers also need an automated skills-based student monitoring tool to observe students' progress against the learning objectives, so they can focus on personalized tutoring tactics and take accurate decisions. Such a tool allows teachers to focus more on students' weaknesses and take the necessary actions to overcome these problems.
Recent studies have shown that students become more motivated to learn with game-based learning tools. These interactive elements facilitate problem solving, make learning new concepts easier, and encourage students to work harder at school and at home. Active learning using a game-based model supports long-term retention of information, which helps students increase their exam scores while acquiring the needed skills. We conducted a survey and analyzed the features of 31 leading technologies in the digital learning industry. We found that only 21 of them offer educational games, 22 target the elementary age range, 15 offer digital resources for mathematics and science, 11 include digital assessment tools to test children's skills, and only 6 include an automated progress-reporting engine, most of which still require manual data entry by teachers. There is a need for a complete game-based learning platform with automatic performance and progress reporting that requires no manual intervention and that is customized to elementary school curriculum standards.
We developed an educational platform called ‘IntelliFun’ that uses educational games to automatically monitor and assess the progress of elementary school children. It can also be applied to a wider range of courses that use outcome-based learning through games. Our intelligent game-based ‘IntelliFun’ platform provides a potential solution to several serious issues in education. Its entertaining gaming features improve students' learning motivation, problem-solving ability and working memory capacity. In parallel, its student performance monitoring features strengthen family engagement. Having these features integrated in one technology makes ‘IntelliFun’ a novel solution in digital education.
To generate students' outcomes while they play the games, we need an effective technology to relate the curriculum standards and learning objectives to the game worlds' content (i.e., scenes and activities). The technology we use is an ontology-based approach. We have designed a new ontology model that maps the program curricula and learning objectives to the flow-driven game world elements. The children's performance is evaluated through the ontology using information extraction with an automated reasoning mechanism guided by a set of inference rules. Technically, using ontologies in the field of education and games is very challenging, and our data model forms a novel solution to two issues:
• The complexity of designing educational data models where learning objectives and curriculum standards are matched and incorporated in serious games, and
• The complexity of providing advanced reasoning over the data.
This allows the fusion of many challenging technologies: digital education, semantic web, games, monitoring systems and artificial intelligence.
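As a rough illustration of this ontology-based mapping, the sketch below (in Python, using rdflib) builds a tiny graph that links a hypothetical learning objective to a game scene and to recorded player actions, and then queries the number of correct actions per objective with SPARQL. The namespace, class and property names are assumptions made for illustration; they are not the actual IntelliFun ontology, which is implemented with the Apache Jena API.

```python
from rdflib import Graph, Namespace, Literal, RDF, RDFS

# Hypothetical namespace and vocabulary, for illustration only.
EDU = Namespace("http://example.org/intellifun/")

g = Graph()
g.bind("edu", EDU)

# A learning objective taken from an elementary mathematics curriculum (illustrative).
g.add((EDU.AdditionTo10, RDF.type, EDU.LearningObjective))
g.add((EDU.AdditionTo10, RDFS.label, Literal("Addition facts to 10")))

# A game scene that exercises that objective.
g.add((EDU.FruitShopScene, RDF.type, EDU.GameScene))
g.add((EDU.FruitShopScene, EDU.exercises, EDU.AdditionTo10))

# A player action recorded during play, linked to the scene and its outcome.
g.add((EDU.Action42, RDF.type, EDU.PlayerAction))
g.add((EDU.Action42, EDU.performedIn, EDU.FruitShopScene))
g.add((EDU.Action42, EDU.outcome, Literal("correct")))

# Count correct actions per learning objective: a very simple "performance" view
# of the kind a reasoning engine could feed to progress reports.
query = """
PREFIX edu: <http://example.org/intellifun/>
SELECT ?objective (COUNT(?action) AS ?nCorrect)
WHERE {
  ?scene  edu:exercises   ?objective .
  ?action edu:performedIn ?scene ;
          edu:outcome     ?outcome .
  FILTER (STR(?outcome) = "correct")
}
GROUP BY ?objective
"""
for row in g.query(query):
    print(row.objective, row.nCorrect)
```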
Our work is deeply rooted in the state of the art in educational games and digital education systems. The curriculum ontology was inspired by the British Curriculum Ontology (BBC 2013). The instances related to learning objectives are extracted from the elementary curriculum standards of the Supreme Education Council of Qatar. The game ontology model follows story-based scenarios as described in Procedural Content Generation (Hartsook 2011) and HoloRena (Juracz 2010). We used the trajectory trace ontology described in STSIM (Corral 2014) to design the student monitoring ontology. To evaluate students' performance, we use an inference-rule-based reasoning engine to query the correct, incorrect and incomplete actions performed by the player, as described in ontology-based information extraction (Gutierrez 2013). To measure the learner's performance, key indicators feed the reasoning engine, which executes the appropriate calculation methods.
The platform is implemented as a 3-tier architecture with mobile game applications at the front end. These games can query and update the ontology in real time through a web service by invoking data management, reasoning, monitoring and reporting operations using the Apache Jena ontology API. The platform can dynamically generate the content of the games based on the children's preferences and acquired knowledge. The platform's monitoring features allow teachers to track the children's achievement of every learning objective and also strengthen parents' engagement in their children's learning experience: parents can follow up on their children and see their strengths and weaknesses. ‘IntelliFun’ is used to improve children's learning outcomes and keep them motivated while playing games.
We aim to start testing our platform with real users, using the grade 1 mathematics curriculum as a case study. Our user study will include students, parents and teachers, who will answer an evaluation questionnaire after testing the technology. This will help us evaluate the efficacy of the platform and ascertain its benefits by analyzing its impact on students' learning experience. An interesting direction for future work is the use of data mining techniques in the reasoning engine to evaluate students' performance with more complex key performance indicators. We can also consider dynamic generation of game world content based on students' preferences and acquired learning skills.
-
-
-
Enhancing Information Security Process in Organisations in Qatar
Due to the universal use of technology and its pervasive connection to the world, organisations have become more exposed to frequent and varied threats (Rotvold, 2008). Therefore, organisations today are giving more attention to information security, which has become a vital and challenging issue. Mackay (2013) noted that the significance of information security, particularly information security policies and awareness, is growing due to the increasing use of IT and computerisation. Accordingly, information security plays a key role in the internet era. Gordon & Loeb (2006) stated that information security involves a group of actions intended to protect information and information systems. It involves software, hardware, physical security and human factors, where each element has its own features. Information security protects not only the organisation's information but the complete infrastructure that enables its use. Organisations face an increasing number of daily security breaches, and as information becomes more accessible to the public the threat grows larger, so security requirements need to be tightened.
Information security policies control employees' behavior as well as securing the use of hardware and software. Organisations benefit from implementing information security policies as they help them classify their information assets and define the importance of those assets to the organisation (Canavan, 2003). An information security policy can be defined as a set of principles, regulations, methodologies, procedures and tools created to secure the organisation from threats. Boss and Kirsch (2007) stated that employees' compliance with information security policies has become an important socio-organisational resource. Information security policies are applied in organisations to provide employees with guidelines that guarantee information security.
Herold (2010) stressed the importance for organisations of having ongoing training programmes and awareness education to attain the required results from implementing an information security policy. Security experts emphasise the importance of security awareness programmes and how they improve information security as a whole. Nevertheless, implementing security awareness in organisations is a challenging process, as it requires actively engaging an audience that often does not appreciate the importance of information security (Manke, 2013). Organisations tend to use advanced security technologies and constantly train their security professionals, while paying little attention to enhancing the security awareness of employees and users. This makes employees and users the weakest link in any organisation (Warkentin & Willison, 2009).
In the last ten years, the State of Qatar has witnessed remarkable growth and development, having embraced information technology as a base for innovation and success. The country has seen tremendous improvement in the sectors of health care, education and transport (Al-Malki, 2015). Information technology plays a strategic role in building the country's knowledge-based economy. Due to the country's increasing use of the internet and its connection to the global environment, Qatar needs to adequately address the global threats arising from the internet. Qatar's role in world politics has exposed it not just to traditional threats from hackers, but to more malicious actors such as terrorists, organised criminal networks and foreign government spying. Qatar has faced considerable friction with some countries that try to breach the country's security. The Qatar Computer Emergency Response Team (Q-CERT), which is responsible for addressing the state's information security needs, stated: “As Qatar's dependence on cyberspace grows, its resiliency and security become even more critical, and hence the needs for a comprehensive approach that addresses this need” (Q-CERT, 2015). Therefore, Q-CERT established the National Information Assurance (NIA) policy, an information security policy designed to help both government and private sectors in Qatar protect their information and enhance their security. Nevertheless, the NIA policy has not yet been implemented in any organisation in Qatar. This is due to the barriers and challenges of information security in Qatar, such as culture and awareness, which make the implementation of information security policies a challenging undertaking.
As a result, the scope of this research is to investigate information security in Qatar. There are many solutions for information security; some are technical and others are non-technical, such as security policies and information security awareness. This research focuses on enhancing information security through non-technical solutions, in particular information security policy. The aim of this research is to enhance information security in organisations in Qatar by developing a comprehensive Information Security Management System (ISMS) that considers the country-specific and cultural factors of Qatar. An ISMS is a combination of policies and frameworks that ensure information security management (Rouse, 2011). This information security management approach is unique to Qatar as it considers Qatari culture and country-specific factors. Although many international information security standards are available, such as ISO 27001, this research shows that they do not always address the security needs particular to the culture of the country. Therefore, there is a need to define a unique ISMS approach for Qatar.
To accomplish the aim of this research the following objectives must be achieved.
1. To review literature on information security in general and in Qatar in particular.
2. To review international and local information security standards and policies.
3. To explore the NIA policy in Qatar and compare it with others in the region and internationally.
4. To define problems with implementing information security policies and NIA policy in particular in organisations in Qatar.
5. To provide recommendations for the new version of the NIA policy.
6. To assess the awareness of employees on information security.
7. To assess the information security process in organisations in Qatar.
8. To identify the factors which affect information security in Qatar including culture and country specific factors.
9. To propose an ISMS for Qatari organisations taking into consideration the above factors.
10. To define a process for organisations to maintain the ISMS.
11. To evaluate the effectiveness of the proposed ISMS.
To achieve the aim of this research, different research methodologies, strategies and data collection methods will be used, such as literature review, surveys, interviews and case study. The research comprises three phases. The researcher has currently completed phase one, which analyses the field of information security and highlights the gaps in the literature that can be investigated further in this research. It also examines the country factors that affect information security and the implementation of information security policies, alongside interviews with experts in the fields of information technology, information security, culture and law to identify the state of information security in Qatar and the factors that might affect its development, including cultural, legal and political issues. In the following two years, the researcher will complete phases two and three. During phase two, the researcher will measure the awareness of employees and their knowledge of information security and, in particular, information security policies. The findings will help the researcher complete phase three, which involves investigating the NIA policy further, carrying out a real implementation of an ISMS in an organisation in Qatar, and analysing the main findings to finally provide recommendations for improving the NIA policy.
In conclusion, the main contribution of this research is to investigate the NIA policy and the challenges facing its implementation, and then define an ISMS process for the policy to assist organisations in Qatar in implementing and maintaining it. The research is valuable since it will involve the first real implementation of the NIA policy in an organisation in Qatar, taking advantage of the researcher's internship with ICT. The research will move the policy from its paper-based form into a working ISMS and oversee its operation in one of the organisations.
Keywords: Information security, National Information Assurance policy, Information Security Management System, Security Awareness, Information Systems
References
Rotvold, G. (2008). How to Create a Security Culture in Your Organization. Available: http://content.arma.org/IMM/NovDec2008/How_to_Create_a_Security_Culture.aspx. Last accessed 1st Aug 2015.
Manke, S. (2013). The Habits of Highly Successful Security Awareness Programs: A Cross-Company Comparison. Available: http://www.securementem.com/wp-content/uploads/2013/07/Habits_white_paper.pdf. Last accessed 1st Aug 2015.
Al-Malki. (2015). Welcome to Doha, the pearl of the Gulf. Available: http://www.itma-congress-2015.com/Welcome_note_2.html. Last accessed 4th May 2015.
Rouse, M. (2011). Information security management system (ISMS). Available: http://searchsecurity.techtarget.in/definition/information-security-management-system-ISMS. Last accessed 22nd Aug 2015.
Q-CERT. (2015). About Q-CERT. Available: http://www.qcert.org/about-q-cert. Last accessed 1st Aug 2015.
Warkentin, M., and Willison, R. (2009). “Behavioral and Policy Issues in Information Systems Security: The Insider Threat,” European Journal of Information Systems (18:2), pp. 101–105.
Mackay, M. (2013). An Effective Method for Information Security Awareness Raising Initiatives. International Journal of Computer Science & Information Technology. 5 (2), p 63–71.
Gordon, L. A. & Loeb, M. P. (2006). Budgeting Process for Information Security Expenditures. Communications of the ACM. 49 (1), p 121–125.
Herold. R (2010). Managing an Information Security and Privacy Awareness and Training Program. New York: CRC Press.
Boss, S., & Kirsch, L. (2007). The Last Line of Defense: Motivating Employees to Follow Corporate Security Guidelines. Proceedings of the International Conference on Information Systems, pp. 9–12.
Canavan, S. (2003). An Information Security Policy Development Guide for Large Companies. SANS Institute.
-
-
-
Visible Light Communication for Intelligent Transport Systems
Authors: Xiaopeng Zhong and Amine Bermak
Introduction
Road safety is a worldwide health challenge that is of great importance in Qatar. According to the WHO [1], global traffic fatalities and injuries number in the millions per year. Qatar has one of the world's highest rates of traffic fatalities, which cause more deaths than common diseases [2]. Traffic congestion and vehicle fuel consumption are two other major problems. Integrating vehicle communication into intelligent transport systems (ITS) is important as it will help improve road safety, efficiency and comfort by enabling a wide variety of transport applications. Radio frequency communication (RFC) technologies do not meet the stringent transport requirements due to spectrum scarcity, high interference and lack of security [3]. In this work, we propose an efficient and low-cost visible light communication (VLC) system based on CMOS transceivers for vehicle-to-vehicle (V2V) and infrastructure-to-vehicle (I2V) communication in ITS, as a complementary platform to RFC.
Objective
The proposed VLC system is designed to be low cost and efficient, supporting the various V2V and I2V communication scenarios shown in Fig. 1. The VLC LED transmitters (Tx) are responsible for both illumination and information broadcasting. They are designed to support existing transport infrastructure (such as street lamps, guideboards and traffic lights) as well as vehicle lights, with low cost and complexity. The receivers (Rx) will be mounted on both the front and back of vehicles, with both vision and communication capabilities. The added vision capability enhances the robustness of communication.
System implementation
The VLC system implementation in Fig. 2 is a jointly optimized design of the transmitter, receiver and communication protocol. The LED transmitter work focuses on the design of an LED driver that efficiently combines illumination with communication modulation schemes. A light sensor is integrated to provide adaptive feedback for better power efficiency. Polarization techniques are utilized to cancel background light, which not only enhances image quality but also improves the robustness of VLC, as shown in Fig. 3(a) [4]. A polarization image sensor using a liquid crystal micro-polarimeter array has been designed, as illustrated in Fig. 3(b). The CMOS visible light receiver will be designed based on a traditional CMOS image sensor but with an innovative architecture specifically for V2V and I2V VLC. It features dual readout channels, namely a compressive channel for image capture and a high-speed channel for VLC. Novel detection and tracking algorithms are used to improve communication speed, reliability and security. Compressive sensing is applied for image capture. The compression is facilitated by a novel analog-to-information conversion (AIC) scheme that leads to significant power savings in image capture and processing. A prototype AIC-based image sensor has been successfully implemented, as shown in Fig. 4 [5]. A VLC protocol is specifically tailored for V2V and I2V based on the custom transceivers. The PHY layer is designed based on MIMO OFDM, and the MAC layer design is based on dynamic link adaptation. The protocol is to be an extension and optimization of the IEEE 802.15.7 standard for V2V and I2V VLC. A preliminary prototype VLC system has been designed to verify feasibility: a kbps-level VLC channel has been achieved under illumination levels from tens to hundreds of lux. Further improvements are anticipated with continued research using the novel techniques described above.
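As a side note, the polarization-based background-light cancellation mentioned above can be illustrated with a toy numerical sketch: an OOK-modulated, polarized LED signal is read through two orthogonal polarizer channels, and the differential readout suppresses the (unpolarized) ambient light before threshold detection. All signal levels, noise figures and the threshold are illustrative assumptions, not measurements from the prototype.

```python
import numpy as np

rng = np.random.default_rng(0)

# OOK symbol stream sent by the LED transmitter (1 = light on, 0 = off); illustrative only.
bits = rng.integers(0, 2, size=200)
signal = 1.0 * bits                                   # polarized LED component at the receiver

# Unpolarized ambient light splits roughly equally between two orthogonal polarizer channels,
# while the polarized LED light passes mainly through the co-aligned channel.
ambient = 5.0 + 0.5 * rng.standard_normal(bits.size)  # strong, slowly varying background
noise = 0.05 * rng.standard_normal((2, bits.size))

ch_co    = signal + 0.5 * ambient + noise[0]          # polarizer aligned with the LED
ch_cross = 0.0    + 0.5 * ambient + noise[1]          # orthogonal polarizer

# Differential readout suppresses the common ambient term before threshold detection.
diff = ch_co - ch_cross
detected = (diff > 0.5).astype(int)

print("bit errors:", int(np.sum(detected != bits)))
```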
Conclusion
An efficient and low-cost visible light communication system is proposed for V2V and I2V VLC, featuring low cost and power-efficient transmitter design, dual-readout (imaging and VLC) receiver architecture, fast detection and tracking algorithms with compressive sensing, polarization techniques and specific communication protocol.
References
[1] Global status report on road safety 2013, World Health Organization (WHO).
[2] Sivak, Michael, “Mortality from road crashes in 193 countries”, 2014.
[3] Lu, N.; Cheng, N.; Zhang, N.; Shen, X.S.; Mark, J.W., “Connected Vehicles: Solutions and Challenges,” IEEE Internet of Things Journal, vol. 1, no. 4, pp. 289–299, Aug. 2014.
[4] X. Zhao, A. Bermak, F. Boussaid and V. G. Chigrinov, “Liquid-crystal micropolarimeter array for full Stokes polarization imaging in visible spectrum”, Optics Express, vol. 18, no. 17, pp. 17776–17787, 2010.
[5] Chen, D.G.; Fang Tang; Law, M.-K.; Bermak, A., “A 12 pJ/Pixel Analog-to-Information Converter Based 816 × 640 Pixel CMOS Image Sensor,” IEEE Journal of Solid-State Circuits, vol. 49, no. 5, pp. 1210–1222, 2014.
-
-
-
A General Framework for Designing Sparse FIR MIMO Equalizers Based on Sparse Approximation
Authors: Abubakr Omar Al-Abbasi, Ridha Hamila, Waheed Bajwa and Naofal Al-Dhahir
In broadband communications, the channel delay spread, defined as the duration in time, or samples, over which the channel impulse response (CIR) has significant energy, is often long and results in a highly frequency-selective channel frequency response. A long CIR can spread over tens, or even hundreds, of symbol periods and causes impairments in the signals that pass through such channels. For instance, a large delay spread causes inter-symbol interference (ISI) and inter-carrier interference (ICI) in multi-carrier modulation (MCM). Therefore, long finite impulse response (FIR) equalizers have to be implemented at high sampling rates to avoid performance degradation. However, the implementation of such equalizers is prohibitively expensive, as the design complexity of FIR equalizers grows with the square of the number of nonzero taps in the filter. Sparse equalization, where only a few nonzero coefficients are employed, is a widely used technique to reduce complexity at the cost of a tolerable performance loss. Nevertheless, reliably determining the locations of these nonzero coefficients is often very challenging.
In this work, we first propose a general framework that transforms the problem of designing sparse single-input single-output (SISO) and multiple-input multiple-output (MIMO) linear equalizers (LEs) into the problem of sparsest approximation of a vector in different dictionaries. In addition, we compare several choices of sparsifying dictionaries under this framework. Furthermore, the worst-case coherence of these dictionaries, which determines their sparsifying effectiveness, is analytically and/or numerically evaluated. Second, we extend our framework to accommodate SISO and MIMO non-linear decision-feedback equalizers (DFEs). Similar to the sparse FIR LE design problem, the design of sparse FIR DFEs can be cast as sparse approximation of a vector by a fixed dictionary, whose solution can be obtained using either greedy algorithms, such as Orthogonal Matching Pursuit (OMP), or convex-optimization-based approaches, with the former being more desirable due to its low complexity. Third, we further generalize our sparse design framework to the channel shortening setup. Channel shortening equalizers (CSEs) are used to ensure that the cascade of a long CIR and the CSE is approximately equivalent to a target impulse response (TIR) with a much shorter delay spread. Channel shortening is essential for communication systems operating over highly dispersive broadband channels with a large channel delay spread. Fourth, as an application of recent practical interest to the power-line communication (PLC) community, we consider channel shortening for the impulse responses of medium-voltage power lines (MV-PLs) with lengths of 10 km and 20 km to reduce the cyclic prefix (CP) overhead in orthogonal frequency-division multiplexing (OFDM) and, hence, improve the data rate accordingly. For all design problems, we propose reduced-complexity sparse FIR SISO and MIMO linear and non-linear equalizers by exploiting the asymptotic equivalence of Toeplitz and circulant matrices, where the matrix factorizations involved in our design analysis can be carried out efficiently using the fast Fourier transform (FFT) and inverse FFT with negligible performance loss as the number of filter taps increases.
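As a minimal sketch of how sparse equalizer design reduces to sparse approximation of a vector in a dictionary, the Python snippet below uses the channel convolution (Toeplitz) matrix as the dictionary, a delayed impulse as the target response, and OMP to select a few nonzero taps. The channel coefficients, equalizer length, delay and tap budget are illustrative assumptions, and the zero-forcing-style target is a simplification of the MMSE-based formulations considered in this work.

```python
import numpy as np
from scipy.linalg import toeplitz
from sklearn.linear_model import OrthogonalMatchingPursuit

# Illustrative dispersive channel impulse response (CIR).
h = np.array([0.2, 1.0, 0.6, -0.3, 0.15, 0.05])
Lw = 40                                   # number of candidate equalizer taps

# Convolution matrix A such that A @ w == np.convolve(h, w): the sparsifying dictionary.
A = toeplitz(np.r_[h, np.zeros(Lw - 1)], np.r_[h[0], np.zeros(Lw - 1)])

# Target: the combined channel + equalizer response should approximate a pure delay.
delay = 10
d = np.zeros(A.shape[0])
d[delay] = 1.0

# Sparse approximation: ask OMP for only a few nonzero equalizer taps.
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=8, fit_intercept=False)
omp.fit(A, d)
w_sparse = omp.coef_

residual = np.linalg.norm(A @ w_sparse - d)
print("active taps:", np.count_nonzero(w_sparse), " residual:", round(float(residual), 4))
```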
Finally, the simulation results show that allowing a small performance loss yields a significant reduction in the number of active filter taps for all proposed LE and DFE designs, which in turn results in substantial complexity reductions. The simulation results also show that the CIRs of MV-PLs with lengths of 10 km and 20 km can be shortened to fit within the broadband PLC standards. Additionally, our simulations validate that the sparsifying dictionary with the smallest worst-case coherence results in the sparsest FIR filter design. Furthermore, the numerical results demonstrate the superiority of our proposed approach compared to conventional sparse FIR filters in terms of both performance and computational complexity. Acknowledgment: This work was made possible by grant number NPRP 06-070-2-024 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.
-
-
-
Novel Vehicle Awareness Measure for Secure Road Traffic Safety Applications
Authors: Muhammad Awais Javed and Elyes Ben Hamida
Future intelligent transport systems (ITS) are envisaged to offer drivers a safer and more comfortable driving experience by using wireless data exchange between vehicles. A number of applications could be realized with the increased vehicle vision and awareness provided by this technology, known as Vehicular Ad hoc Networks (VANETs). These applications include cooperative awareness, warning notification, safe lane change and intersection crossing, intelligent route selection, traffic management, parking selection, multi-player games and internet browsing.
The success of VANETs and their proposed applications depends on secure and reliable message transmission between vehicles. Every vehicle broadcasts periodic safety messages to the neighborhood traffic to announce its presence. Each safety message contains the vehicle's mobility information, including its location, speed, direction and heading. Based on these safety messages, vehicles build a local dynamic map (LDM) that provides them with a complete description of the surrounding traffic. Using the LDM, vehicles can look beyond the line of sight and make safe and intelligent driving decisions.
An increased level of vehicle safety awareness is the primary goal of road safety applications. An accurate measure of this awareness is critical to evaluate the impact of different parameters, such as security and vehicle density, on vehicle safety and application quality of service. A precise and correct metric for vehicle safety awareness should take into account the knowledge of the vehicle's surroundings and the accuracy of the information received in cooperative awareness messages (CAMs) and stored in the LDM. Existing metrics in the literature use quantitative measures of awareness, such as packet delivery ratio, and do not consider the accuracy and fidelity of the received information in the LDM. Due to GPS error and outdated information in the LDM, vehicles could have a reduced level of awareness, resulting in the dissemination of false positives and false negatives that could badly impact road safety applications.
In this paper, we propose two novel metrics for evaluating vehicle safety awareness. Both metrics start by applying our proposed vehicle-heading-based filtering mechanism so that only the critical neighbors in the surroundings (i.e., those that are moving towards a vehicle and could collide with it) are considered when calculating awareness. The first metric, the Normalized Error based Safety Awareness Level (SAL), calculates awareness from the number of neighbors a vehicle has successfully discovered in its LDM and a normalized distance error computed from the actual position of each neighbor and the position information available in the LDM. By accounting for the position error in the information contained in the LDM, vehicles measure their awareness levels accurately.
To further improve this safety awareness metric, we propose a weighted Normalized Error based Safety Awareness Level (wSAL) metric that uses a sigmoid function to assign a higher weight to errors coming from nearby neighbor vehicles. Since the position error of a closer neighbor is more critical for safety applications, the vehicle awareness level can be measured more accurately by giving such neighbors greater importance.
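A minimal sketch of the two metrics is given below, under simplifying assumptions: the ego vehicle sits at the origin, a crude heading test marks the 'critical' neighbours, undiscovered neighbours count as maximal error, and the sigmoid midpoint and slope are arbitrary. It illustrates the idea rather than reproducing the exact SAL/wSAL formulation.

```python
import numpy as np

def safety_awareness(actual_pos, ldm_pos, headings, ego_heading,
                     max_err=10.0, weighted=False):
    """Toy normalized-error safety awareness level (SAL / wSAL-style).

    actual_pos : (N, 2) true neighbour positions in metres (ego vehicle at the origin)
    ldm_pos    : (N, 2) positions stored in the ego LDM, NaN rows for undiscovered neighbours
    headings   : (N,) neighbour headings in radians; ego_heading: ego heading in radians
    """
    # Heading-based filter: keep only neighbours roughly moving towards the ego vehicle.
    critical = np.abs(np.angle(np.exp(1j * (headings - ego_heading)))) > np.pi / 2
    if not np.any(critical):
        return 1.0

    act, ldm = actual_pos[critical], ldm_pos[critical]
    discovered = ~np.isnan(ldm).any(axis=1)

    err = np.linalg.norm(act - np.where(discovered[:, None], ldm, act), axis=1)
    err[~discovered] = max_err                      # undiscovered => maximal error
    norm_err = np.clip(err / max_err, 0.0, 1.0)

    if weighted:                                    # wSAL: nearby neighbours weigh more
        dist = np.linalg.norm(act, axis=1)
        w = 1.0 / (1.0 + np.exp((dist - 50.0) / 10.0))   # arbitrary sigmoid parameters
        return float(1.0 - np.average(norm_err, weights=w))
    return float(1.0 - norm_err.mean())             # SAL

actual = np.array([[20.0, 5.0], [60.0, -3.0]])
ldm    = np.array([[21.5, 4.0], [np.nan, np.nan]])  # second neighbour not yet discovered
head   = np.array([np.pi, np.pi])                   # both driving towards the ego vehicle
print(safety_awareness(actual, ldm, head, ego_heading=0.0))
print(safety_awareness(actual, ldm, head, ego_heading=0.0, weighted=True))
```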
We developed a simulation model using NS-3 network simulator and SUMO traffic simulator to generate realistic road traffic scenario at different vehicle densities. Simulation results verify that the existing metrics provide optimistic results for vehicle awareness and our proposed metrics improve the measure of awareness. This leads to a better performance evaluation of safety applications.
-
-
-
Energy Efficient Antenna Selection for a MIMO Relay Using RF Energy Harvesting
Authors: Amr Mohamed and Islam Samy
Due to the rapid growth in traffic demands and the number of subscribers, transmit energy consumption has become critical, both environmentally and economically. Increasing the energy efficiency of wireless networks is a main goal of 5G network research, and the research community has proposed promising solutions supporting green communication techniques. Energy efficiency can also be enhanced in a different way: energy can be drawn from renewable sources, compensating (totally or partially) for the traditional power consumption from the grid. Energy harvesting has thus emerged as a promising technique that helps increase the sustainability of wireless networks. In this paper, we investigate energy-efficient antenna selection schemes for a MIMO relay powered by a hybrid energy source (from the grid or through RF energy harvesting). We aim to utilize the large number of antennas efficiently for both data decoding and energy transfer. We formulate an optimization problem and provide the optimal antenna selection scheme such that the joint power consumption (source and relay power) is minimized while meeting the rate requirements. The problem is a mixed-integer, non-linear and non-convex program, i.e., prohibitively complex. We propose two special cases of the general problem: Fixed Source Power Antenna Selection (FSP-AS), in which we assume a fixed source power and control the antenna selection only, and All Receive Antenna Selection (AR-AS), in which all receiving antennas are turned ON. We also introduce two less complex heuristics, Decoding Priority Antenna Selection (DP-AS) and Harvesting Priority Antenna Selection (HP-AS). Finally, we compare our work with the Generalized Selection Combiner (GSC) scheme used in previous works.
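To make the antenna-selection trade-off concrete, the toy sketch below enumerates how many relay antennas decode data versus harvest RF energy and picks the split that minimizes the joint source-plus-relay grid power under a Shannon-style rate constraint. The channel gain, circuit power and harvesting efficiency values are illustrative assumptions, and the model is far simpler than the mixed-integer formulation summarized above.

```python
def select_antennas(n_ant=8, rate_req=4.0, bw=1.0, noise=1e-3,
                    chan_gain=0.05, circuit_pw=0.1, harvest_eff=0.6):
    """Toy search over decode/harvest antenna splits at the relay (illustrative parameters)."""
    best = None
    snr_req = 2 ** (rate_req / bw) - 1                # SNR needed to meet the target rate
    for n_dec in range(1, n_ant + 1):                 # antennas used for data decoding
        n_harv = n_ant - n_dec                        # remaining antennas harvest RF energy
        # Source power so the decoding antennas meet the rate (simple MRC-style SNR scaling).
        p_src = snr_req * noise / (chan_gain * n_dec)
        # RF energy harvested from the source transmission by the harvesting antennas.
        p_harv = harvest_eff * chan_gain * n_harv * p_src
        # Relay grid power: circuit power of the active antennas minus what was harvested.
        p_relay = max(circuit_pw * n_ant - p_harv, 0.0)
        total = p_src + p_relay                       # joint (source + relay) power
        if best is None or total < best[0]:
            best = (total, n_dec, n_harv, p_src)
    return best

total, n_dec, n_harv, p_src = select_antennas()
print(f"decode={n_dec}, harvest={n_harv}, source power={p_src:.3f}, joint power={total:.3f}")
```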
The main contributions of our work can be summarized as follows:
(1) We introduce the energy harvesting technique as an effective way to improve the energy efficiency by using it as a substitute for the grid energy.
(2) In addition to the transmitted energy, we model the circuit power as an important part of the total energy consumption which can affect the energy efficiency.
(3) We allow each antenna to be turned ON or OFF individually, so the antennas that are not needed can be switched off to save as much energy as possible.
(4) We introduce two special-case schemes, each addressing a particular component of the energy consumption: the FSP-AS scheme cares more about the circuit energy, while the AR-AS scheme concentrates mainly on the transmitted energy.
(5) We also propose two heuristics to cope with the complexity of the target problem. We evaluate the performance of the proposed schemes numerically. Our key performance indicator (KPI) is the joint power consumed in both the source and the relay. The simulation results show the gain of our optimal scheme in terms of energy efficiency, which can be up to 80% compared to solutions proposed in the literature. Our heuristics show reasonable performance at low rates, with almost no gap to the optimal scheme at higher target rates. In future work, we will consider modeling more than one source and destination node and extend the model to include interference scenarios.
-
-
-
Green Techniques for Environment-Aware 5G Networks
Authors: Hakim Ghazzai and Abdullah Kadri
Over the last decade, mobile communications have witnessed an unprecedented and perpetually increasing rise in mobile user demand due to the introduction of new services requiring extremely fast and reliable connectivity. Moreover, there has been a significant increase in the number of devices connected to cellular networks because of the emergence of machine-type communication and the internet of things. Indeed, data traffic on mobile networks is increasing at a rate of approximately 1.5 to 2 times a year, so mobile networks are expected to handle up to 1000 times more data traffic within 10 years. Because of this huge number of wireless terminals, in addition to the radio access networks (RANs) deployed to serve them, future fifth-generation (5G) cellular networks will suffer from an enormous growth in energy consumption, with negative economic and environmental impacts. It is predicted that, if no action is taken, greenhouse gas (GHG) emissions per capita for ICT will increase from 100 kg in 2007 to about 130 kg in 2020. Therefore, there is an urgent need to develop new techniques and technologies to cope with the exponential growth in energy consumption and, correspondingly, the carbon emissions of emerging wireless networks. From a cellular network operator's perspective, reducing fossil fuel consumption is not only a matter of behaving in a green and responsible way towards the environment, but also of solving an important economic issue that cellular operators are facing. Such energy consumption forces mobile operators to pay huge energy bills, which currently constitute around half of their operating expenditures (OPEX). It has been shown that cellular networks currently consume around 120 TWh of electricity per year, and mobile operators pay around 13 billion dollars to serve 5 billion connections per year.
Therefore, there is a growing need to develop more energy-efficient techniques that enhance the networks' green performance while respecting the users' quality of experience. Although most existing studies have focused on individual physical-layer power optimizations, more sophisticated and cost-effective technologies should be adopted to meet the green objective of 5G cellular networks. This study investigates three important techniques that could be exploited separately or together to enable wireless operators to achieve significant economic benefits and environmental savings:
- Cellular networks powered by the smart grid: The smart grid is widely seen as one of the most important means of enhancing energy savings and helping consumers meet their green goals. It can considerably help in reducing GHG emissions by optimally controlling and adjusting the consumed energy. Moreover, it allows the massive integration of intermittent renewable sources and offers the possibility of delivering electricity in a more cost-effective way, with active involvement of customers in procurement decisions. Therefore, introducing the smart grid as a new tool for managing the energy procurement of cellular networks is an important technological innovation that would significantly contribute to the reduction of mobile CO2 emissions.
- Base station sleeping strategy: Several studies show that over 70% of the power is consumed by base stations (BSs), or long-term evolution eNodeBs (LTE-eNBs) in 4G networks. Turning off redundant or lightly loaded BSs during off-peak hours can contribute to the reduction of mobile network energy consumption and GHG emissions (a toy sketch of such a load-based sleeping decision follows this list).
- Green networking collaboration among competing mobile operators: The fundamental idea is to completely turn off the equipment of one service provider and serve the corresponding subscribers with infrastructure belonging to another operator. However, random collaboration may increase one operator's profit at the expense of its competitors, causing high energy consumption and very low profit for the active network. Therefore, fairness criteria should be introduced for this type of collaboration.
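The toy sketch below illustrates the base-station sleeping idea from the second point above: switch off lightly loaded BSs whenever their traffic can be absorbed by the remaining active ones. The load threshold, unit capacity and uniform traffic redistribution are assumptions made for illustration, not the strategy evaluated in this study.

```python
def plan_sleeping(bs_loads, capacity=1.0, sleep_threshold=0.2):
    """Toy off-peak BS sleeping heuristic (illustrative thresholds)."""
    active = dict(bs_loads)
    for bs in sorted(bs_loads, key=bs_loads.get):        # consider lightest-loaded BSs first
        if active[bs] >= sleep_threshold or len(active) == 1:
            continue
        others = [b for b in active if b != bs]
        spare = sum(capacity - active[b] for b in others)
        if spare >= active[bs]:                           # neighbours can carry the traffic
            share = active[bs] / len(others)
            for b in others:                              # redistribute the offloaded load
                active[b] += share
            del active[bs]                                # this BS goes to sleep
    return active                                         # remaining active BSs and their loads

print(plan_sleeping({"bs1": 0.05, "bs2": 0.10, "bs3": 0.60, "bs4": 0.70}))
```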
In this study, we present in detail the techniques described above and provide multiple simulation results measuring the gain that could be obtained using these techniques compared to that of traditional scenarios.
-
-
-
Vibration Energy Harvesting in Wireless Sensor Networks (WSNs) for Structural Health Monitoring (SHM)
Authors: Loay Ismail, Sara Elorfali and Tarek Elfouly
Harvesting vibration energy from the ambient environment, such as the vibrations experienced by bridges due to vehicle movement, wind and earthquakes, has become an essential area of study for scientists aiming to design new systems that improve self-powered sensor nodes in wireless sensor networks (WSNs), thus providing more efficient systems that do not require human involvement.
One of the essential components of a WSN system is the sensor node. It continuously sends and receives information to monitor a certain behavior targeted by the application, for example the health of a bridge's infrastructure. Sensors are sometimes programmed to send monitoring data 24 hours a day, seven days a week. This configuration strains the sensors' batteries and shortens their lives, since sending and receiving data consumes power and reduces the batteries' voltage levels. For this reason, energy harvesting is critical for maintaining long-lasting batteries that can recharge themselves from the available ambient energy, eliminating the need for human involvement in replacing or recharging them at their locations in the network.
Recent structural health monitoring (SHM) systems in civil infrastructure environments have focused heavily on the use of wireless sensor networks (WSNs) due to their efficient use of wireless sensor nodes. Such nodes can be fixed onto any part of the infrastructure, such as bridges, to collect data remotely for monitoring and further processing. However, the drawback of such sensor networks lies mainly in the finite lifetime of their batteries. Because of this problem, the concept of harvesting energy from the ambient environment has become more important. Ensuring efficient battery usage greatly helps maximize overall system operating time and ensures efficient use of natural energy resources such as solar, wind and vibration energy.
This work studies the feasibility of using a piezoelectric vibration energy harvester to extend overall battery life using a single external super-capacitor as a storage unit for the harvested energy. The methodology followed in this work traces the general flow of energy in a sensor node, which can be summarized as follows:
1 Piezoelectric Vibration Energy Harvester: This module was used to convert mechanical energy of the vibrations from the ambient environment to electrical energy.
2 Energy Harvesting Circuit: This circuit is responsible for the power conditioning, enabling the circuit to output energy to the sensors under certain threshold criteria.
3 Energy Storage: The super-capacitor serves to store the harvested energy.
4 Energy Management Scheme: The scheme proposed by this work operates under the energy requirements and constraints of the sensor nodes in order to conserve the batteries' voltage levels and extend the sensors' battery lives.
5 Wireless Sensors Nodes: Each sensor node type has specific energy requirements that must be recognized so that it can be adequately powered and turned on using the harvested energy.
The main contribution of this work is an energy management scheme which ensures that the harvested energy supplied through the harvester circuit is greater than the energy that will be consumed by the sensor. The proposed scheme has demonstrated the feasibility of using impact vibrations for efficient energy harvesting and, subsequently, of increasing the battery lifetime needed to turn on the wireless sensor nodes.
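A minimal sketch of this decision rule, with illustrative energy values and a hypothetical reserve margin, could look as follows.

```python
def can_power_sensor(harvested_joules, duty_cycle_joules, reserve=0.1):
    """Release stored energy to the sensor node only when the harvested energy exceeds
    the energy the next sense-and-transmit cycle will consume, plus a reserve margin.
    The 10% reserve is an assumption for illustration."""
    return harvested_joules >= (1.0 + reserve) * duty_cycle_joules

# Illustrative numbers: 18 mJ harvested from impact vibrations vs. a 15 mJ duty cycle.
print(can_power_sensor(0.018, 0.015))   # True: the node can be woken up
print(can_power_sensor(0.012, 0.015))   # False: keep accumulating charge in the super-capacitor
```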
Furthermore, as a future direction of work, to increase the amount of harvested energy, hybrid power sources can be explored by combining more than one energy source from the ambient environment, such as solar and vibration energy.
-
-
-
Design of a Novel Wireless Communication Scheme that Jointly Supports Both Coherent and Non-Coherent Receivers
Authors: Mohammad Shaqfeh, Karim Seddik and Hussein Alnuweiri
As is well known, wireless channels are characterized by temporal and spatial variations due to factors such as multipath propagation of the signals and mobility of the communicating devices or their surrounding environment. This has two consequences. First, the channel quality (i.e., amplitude) varies, resulting in changes in the data rate (in bits/sec) that can be received reliably over the channel. Second, the channel “phase” varies, which necessitates that the receiver track these changes reliably in order to demodulate the transmitted signal correctly. This is called “coherent” transmission. If the receiver is unable to track the phase variations, the transmitter should use “non-coherent” modulation schemes, which can be detected without phase knowledge at the receiver, but at the cost of a significant degradation in the data rate that can be transmitted reliably. Therefore, modern communication systems include channel estimation algorithms to enable coherent reception. However, this is not always feasible. Channel estimation is usually accomplished by transmitting pilot signals at some frequency. Depending on the frequency of pilot transmission and the channel coherence time, some receivers may have reliable channel estimates while others may not. This is one reason why each mobile wireless standard supports a maximum velocity for mobile users, limited by the frequency of pilot transmission. Mobile users moving at higher speeds might not have reliable channel estimates, which means that they will not be able to receive any information via coherent transmission.
Having this in mind, we are mainly interested in this work in broadcasting systems like mobile TVs. These systems are usually transmitted using coherent modulation schemes in order to enable good quality of reception which cannot be maintained using the “low-rate” non-coherent modulation schemes. Therefore, mobile users with unreliable channel estimates will not be able to receive such applications, while the users with reliable channel estimates can receive the transmitted stream reliably. Therefore, broadcasting applications are characterized by “all or nothing” reception depending on the mobility and the channel conditions of the receiving terminals.
Alternatively, we propose a layered coding scheme from a new viewpoint that has not been addressed before in the literature. The scheme has two layers: a base layer (non-coherent layer) that can be decoded by any receiver, even one without reliable channel estimates, and a refinement layer (coherent layer) that can only be decoded at receivers with reliable channel estimates. The basic bits are transmitted on the first layer and the extra bits that improve quality are transmitted on the second layer. Therefore, receivers with unreliable channel estimates can decode the non-coherent layer, while receivers with reliable channel estimates can decode all the information and experience better service quality.
This proposed scheme can be designed using multi-resolution broadcast space-time coding, which allows the simultaneous transmission of low rate (LR) non-coherent information for all receivers, including those with no channel state information (CSI), and high rate (HR) coherent information to those receivers that have reliable CSI. The proposed scheme ensures that the communication of the HR layer is transparent to the underlying LR layer. We can show that both the non-coherent and coherent receivers achieve full diversity, and that the proposed scheme achieves the maximum number of communication degrees of freedom for non-coherent LR channels and coherent HR channels with unitarily-constrained input signals.
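The sketch below illustrates the base/refinement idea with a simple power superposition of a differentially encoded (non-coherent) layer and a coherently modulated layer. It is not the multi-resolution broadcast space-time code proposed here; the layer powers, noise level and perfect-CSI assumption at the coherent receiver are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000

# Base layer: differentially encoded BPSK, decodable without any channel phase knowledge.
base_bits = rng.integers(0, 2, n)
d = np.cumsum(base_bits) % 2
base_sym = 1.0 - 2.0 * np.r_[0, d]            # reference symbol followed by the encoded stream

# Refinement layer: coherent BPSK superimposed at lower power (useful only with a channel estimate).
ref_bits = rng.integers(0, 2, n + 1)
ref_sym = 0.3 * (1.0 - 2.0 * ref_bits)

tx = base_sym + ref_sym

# Channel: unknown phase rotation plus additive noise.
phase = np.exp(1j * rng.uniform(0, 2 * np.pi))
rx = phase * tx + 0.05 * (rng.standard_normal(n + 1) + 1j * rng.standard_normal(n + 1))

# Non-coherent receiver: differential detection recovers the base layer without knowing `phase`.
base_hat = (np.real(rx[1:] * np.conj(rx[:-1])) < 0).astype(int)
print("base-layer bit errors (no CSI):", int(np.sum(base_hat != base_bits)))

# Coherent receiver: with a channel estimate it strips the base layer and decodes the refinement.
rx_eq = np.real(rx / phase)                   # perfect CSI assumed for the sketch
base_coh = np.sign(rx_eq)                     # base symbols dominate the superposition
ref_hat = (rx_eq - base_coh < 0).astype(int)  # the residual carries the refinement layer
print("refinement-layer bit errors (with CSI):", int(np.sum(ref_hat != ref_bits)))
```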
-
-
-
Wearable D2D Routing Strategies for Urban Disaster Management – A Case Study
Authors: Dhafer Ben Arbia, Muhammad Mahtab Alam, Rabah Attia and Elyes Ben Hamida
Critical and public safety operations require real-time communication from the incident area(s) to the distant operations command center, passing through the evacuation and medical support areas. Data transmitted through this type of network is extremely valuable for decision makers and for conducting operations, and any delay in communication may cause loss of life. Above all, existing infrastructure-based communication systems (PSTN, WiFi, 4/5G, etc.) can be damaged and are often not an available solution. An alternative is to deploy an autonomous tactical network at an unpredictable location and time. In this context, however, there are many challenges, especially how to effectively route or disseminate the information. In this paper, we present the behavior of various multi-hop routing protocols evaluated in a simulated disaster scenario with different communication technologies (i.e., WiFi IEEE 802.11, WSN IEEE 802.15.4 and WBAN IEEE 802.15.6). The studied routing strategies are classified as ad hoc proactive and reactive protocols, geographic-based protocols and gradient-based protocols. To be realistic, we conducted our simulations by considering a mall in Doha, in the State of Qatar. Moreover, we generated a mobility trace to model the rescue teams' and crowd's movements during the disaster. In conclusion, extensive simulations showed that WiFi IEEE 802.11 is the best wireless technology to consider in an urban emergency with the studied protocols. On the other hand, the gradient-based routing protocol performed much better, especially with WBAN IEEE 802.15.6.
Keywords: Tactical Ad-hoc Networks; Public Safety and Emergency; Routing Protocols; IEEE 802.11; IEEE 802.15.4; IEEE 802.15.6; Performance Evaluation
I. Introduction
Public safety is a worldwide governmental concern. It is a continuous, reactive set of studies, operations and actions aimed at predicting, planning and performing a successful response to a disaster. With the rise in the number and variety of disasters, not only economies and infrastructures are affected, but also a significant number of human lives. With regard to the emergency response to these disasters, the role of existing Public Safety Network (PSN) infrastructures (e.g., TETRA, LTE) is extremely vital. However, it is anticipated that, during and after a disaster, existing PSN infrastructures can be flawed, oversaturated or completely damaged. Therefore, there is a growing demand for a ubiquitous emergency response system that can be easily and rapidly deployed at an unpredictable location and time. Wearable Body Area Networks (WBANs) are therefore a relevant candidate that can play a key role in monitoring the physiological status of the deployed workforces and the affected individuals. Composed of small, low-power devices connected to a coordinator, the WBAN communication architecture relies on three levels: on-body (or intra-body), body-to-body (or inter-body) and off-body communication networks.
In disaster scenarios, network connectivity and data delivery are challenging problems due to the dynamic mobility and harsh environment [1]. It is envisioned that, in case of unavailable or out-of-range network infrastructures, the WBAN coordinators, along with the WBAN sensors, can exploit cooperative multi-hop body-to-body communications to extend end-to-end network connectivity. The Opportunistic and Mobility Aware Routing (OMAR) scheme is an on-body routing protocol proposed in one of our earlier works [2].
A realistic mobility model is another major challenge for the simulations. To the best of the authors' knowledge, no comparable study in a disaster context has been conducted using a realistic disaster mobility pattern.
In this paper, we investigate various classes of multi-hop body-to-body routing protocols using realistic mobility modeling software provided by Bonn University in Germany [3]. The mobility pattern is used by the WSNET simulator as a mobility trace of the nodes moving during and after the disaster; individuals are considered mobile nodes in the scenario. In each iteration of the conducted simulations, one communication technology configuration is selected (i.e., WiFi IEEE 802.11, WSN IEEE 802.15.4 or WBAN IEEE 802.15.6), and the simulations are then run with each of the routing protocols (i.e., proactive, reactive, gradient-based and geographic-based). This strategy provides insight not only into the behavior of the routing protocols in the disaster context, but also into the suitability of the communication technologies in such a case. For the proactive, reactive, gradient-based and geographic-based routing protocols, the Optimized Link State Routing protocol version 2 (OLSRv2) [4], Ad-hoc On-Demand Distance Vector (AODVv2) [5], Directed Diffusion (DD) [6] and Greedy Perimeter Stateless Routing (GPSR) [7] protocols are selected, respectively.
The remainder of this abstract is organized as follows. In section II, we present briefly the disaster scenario considered. In section III, we explain the results of the simulations. Finally, in Section IV, we conclude and discuss perspectives.
II. Landmark disaster scenario
We investigate a disaster scenario (a fire outbreak) in the “Landmark” shopping mall. The mobility model used is generated by the BonnMotion tool. We assume that the incident is caused by fire in two different locations. Rescuers are immediately called to intervene. We consider that the firefighters are divided into 3 groups of vehicles with 26 firefighters in each group. The medical emergency teams, which would likely reach the mall just after the incident, consist of 6 ambulances with 5 medical staff in each ambulance (30 rescuers in total).
Civilians could also help the rescuers, and they are likewise considered in the mobility trace generation. Injured individuals are transported from the incident areas to the patients-waiting-for-treatment areas to receive first aid. They are then transported to the patient-clearing areas, where they are either put under observation or evacuated to hospitals by ambulance or helicopter. A tactical command center conducting the operations is also represented in the scenario. WSNET is an event-driven simulator for wireless networks that is able to design, configure and simulate a whole communication stack from the physical to the application layer. We benefit from these features to vary the payload with the selected MAC and routing layer in each iteration. These combinations provide a thorough review of the possible communication architectures to consider in disaster operations. The following section describes the outcome of these extensive simulations.
III. Performance evaluation
The main difference between a disaster and any other scenario is the emergency aspect. All data flowing in the network is considered highly important, and ideally no packet should fail to reach its destination. For this reason, our evaluation focuses on the Packet Reception Rate (PRR). Likewise, a packet that arrives too late is as problematic as an unreceived one, so we also consider the delay as a decisive factor. Similarly, the energy consumption is observed.
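A toy computation of the two decisive metrics from per-packet logs might look as follows; the log format and the example timestamps are assumptions for illustration, not the WSNET output format.

```python
def evaluate(tx_log, rx_log):
    """Compute PRR and average end-to-end delay from simple per-packet logs:
    tx_log = {packet_id: send_time}, rx_log = {packet_id: receive_time}."""
    delivered = [p for p in tx_log if p in rx_log]
    prr = len(delivered) / len(tx_log) if tx_log else 0.0
    delays = [rx_log[p] - tx_log[p] for p in delivered]
    avg_delay = sum(delays) / len(delays) if delays else float("inf")
    return prr, avg_delay

prr, delay = evaluate({1: 0.00, 2: 0.10, 3: 0.20}, {1: 0.03, 3: 0.29})
print(f"PRR = {prr:.2f}, average end-to-end delay = {delay * 1000:.0f} ms")
```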
The following table summarizes the obtained results.
In terms of average PRR, WiFi IEEE 802.11 is convincingly better than the other two technologies in combination with all the routing protocols. GPSR achieves a considerable PRR with WBAN IEEE 802.15.6, but the location information of the nodes is assumed to be known. Regarding the delay, DD is particularly efficient with WiFi and WBAN.
To conclude, DD is an efficient routing protocol to consider for indoor operations, while GPSR is most relevant for outdoor operations, where locations can be obtained from GPS.
IV. Conclusion
Disasters are growing remarkably worldwide. Existing communication infrastructures cannot be counted on as part of the rescue communication system. Consequently, to monitor deployed individuals (rescue teams, injured individuals, etc.), data should be forwarded through these individuals' WBANs to reach a distant command center. In order to evaluate the performance of diverse multi-hop routing protocols, we conducted extensive simulations in WSNET using a realistic mobility model. The simulations showed that all evaluated routing protocols (i.e., AODVv2, OLSRv2, DD and GPSR) achieve their best PRR with the WiFi technology, while DD was found to be the most efficient with the WBAN technology. GPSR can also be considered when location information is available.
Acknowledgment
The work was supported by NPRP grant #[6-1508-2-616] from the Qatar National Research Fund which is a member of Qatar Foundation. The statements made herein are solely the responsibility of the authors.
References
[1] M. M. Alam, D. B. Arbia, and E. B. Hamida, “Device-to-Device Communication in Wearable Wireless Networks,” 10th CROWNCOM Conf., Apr-2015.
[2] E. B. Hamida, M. M. Alam, M. Maman, and B. Denis, “Short-Term Link Quality Estimation for Opportunistic and Mobility Aware Routing in Wearable Body Sensors Networks,” WiMob 2014, IEEE 10th Int. Conf. on Wireless and Mobile Computing, Networking and Communications, pp. 519–526, Oct. 2014.
[3] N. Aschenbruck, “BonnMotion: A Mobility Scenario Generation and Analysis Tool.” University of Osnabrück, Jul. 2013.
[4] T. Clausen, C. Dearlove, P. Jacquet, and U. Herberg, “RFC7181: The Optimized Link State Routing Protocol Version 2” Apr-2014.
[5] C. Perkins, S. Ratliff, and J. Dowdell, “Dynamic MANET On-demand (AODVv2) Routing draft-ietf-manet-dymo-26.” Feb-2013.
[6] C. Intanagonwiwat, R. Govindan, and D. Estrin, “Directed diffusion: a scalable and robust communication paradigm for sensor networks,” pp. 56–67, 2000.
[7] B. Karp and H. T. Kung, “GPSR: Greedy Perimeter Stateless Routing for Wireless Networks,” Proc. 6th Annu. ACM/IEEE Int. Conf. on Mobile Computing and Networking (MobiCom 2000), 2000.
-
-
-
Robotic Assistants in Operating Rooms in Qatar, Development phase
Authors: Carlos A. Velasquez, Amer Chaikhouni and Juan P. Wachs
Objectives
To date, no automated solution can anticipate or detect a request from a surgeon during surgical procedures without requiring the surgeon to alter his or her behavior. We are addressing this gap by developing a system that can pass the correct surgical instruments as required by the main surgeon. The study uses a manipulator robot that automatically detects and analyzes explicit and implicit requests during surgery, emulating a human nurse handling surgical equipment. This work constitutes an important step in a larger research project that addresses further challenges related to operative efficiency and safety.
At the 2016 QF Annual Research Forum Conference, we would like to present our preliminary results from the project. First, we describe the methodology used to capture surgical team interactions during several cardiothoracic procedures observed at the HMC Heart Hospital, followed by an analysis of the acquired data. Second, we present experimental results of actual human-robot interaction tests emulating human nurse behavior.
Methods
In order to study the interactions in the operating room during surgical procedures, an analysis model was structured and applied to several cardiothoracic operations captured with MS Kinect V2 sensors. The data obtained were meticulously studied, and the relevant observations were stored in a database to facilitate the analysis and comparison of events representing the different interactions among the surgical team.
Surgical Annotations
Two or three consecutive events identify a manipulation sequence in time. For the purpose of developing an annotation structure, each record in the database can be divided into: information on the time of occurrence of the event, counted from the beginning of the procedure; information describing how the manipulation event occurs; information on the position of the instrument in the space around the patient; and a final, optional component with brief additional information that may help in understanding the event and its relation to the flow of the surgical operation.
Figure 1: Operating room at HMC Heart Hospital. (a) Kinetic Sensor location (b) Surgical team and instrument locations for a cardio thoracic procedure as viewed from the sensor
1.1. Information containing the time of occurrence of the sequence
Timing information of sequences is described by time stamps corresponding to the occurrence of their initial and final events. Some special sequences might include an additional intermediate event that we call 'Ongoing'. Additionally, all events are counted as they occur, and the status of this counter is also included as a field in the time-occurrence group.
1.2. Information describing how the manipulation sequence occurs
A careful observation of the routines performed in the operating room allowed us to identify different sequences of events that can be classified into three general categories describing how the manipulation event occurs:
Commands, which correspond to requests for instruments or operations addressed to the supporting staff. These requests can be classified as verbal, non-verbal or a combination of both. Commands are not made exclusively by surgeons; sometimes the nurse handling instruments also requests actions from the circulating staff.
Predictions, made by the supporting staff when selecting and preparing instruments in advance in order to hand them to the surgeon, Fig. 3. These predictions can be classified as right or wrong depending on the surgeon's decision to accept or reject the instrument when it is offered. Sometimes an instrument whose use was predicted incorrectly at a given time is required by the surgeon shortly afterwards; we classify this kind of event as a partially wrong prediction, Fig. 4.
Actions, which correspond to independent or coupled sequences necessary for the flow of the surgical procedure. For instance, as illustrated in the start and end events of Fig. 2, the surgeon himself picks up an instrument from the Mayo tray. Detailed observation of all relevant actions is essential to understand how commands are delivered, what intermediate events are triggered in response, and how the instruments are moved in space and time between their original and final locations.
Table 1 summarizes the most common sequences of events found during the surgical procedures analyzed. The table also shows how the roles of the surgical team are distributed in relation to the events considered.
1.3. Information related to the instrument and its position in the space around the patient
The instrument is identified simply by its name. Several instances of the same instrument are used during surgery, but for annotation purposes we refer to all of them as if only one were available. In cases where some physical characteristic, such as size, differentiates an instrument from others of the same kind, a different instrument name is selectable; in Table 2, for example, a 'Retractor' is differentiated from a 'Big Retractor'. An instrument can be located in any of the areas listed under the label 'Area' in Table 2, as can be verified from Fig. 1. If one of the members of the surgical team holds the instrument in one or both hands, the exact situation can be specified by selecting one of the options under the label 'Hands' in Table 2.
1.4. Additional information
Any remarkable information related to the event can be included in this free-text field. For example, at some point the nurse can offer two instruments simultaneously to the surgeon; this is a rare situation, since the exchange is usually performed instrument by instrument.
Figure 2: Example of an action: The surgeon picks up directly an instrument
Figure 3: The nurse anticipates the use of one instrument
Figure 4: A Partially wrong prediction: One of two instruments is accepted
Table 1 Description of events and relations to the roles of surgical staff
Table 2 Information about location of the instrument
2. Annotation Software Tool
Based on libraries and information provided by Microsoft, we wrote code to use MS Kinect Studio to annotate the surgical procedures. The use of Kinect Studio has several advantages compared to other tools we evaluated, such as high precision in identifying the length and timing of a sequence, and efficiency in the analysis of simultaneous streams of information. Figure 5 shows the screen presented by Kinect Studio when annotations are being made for the color stream of the surgical recording used as an example in the same illustration. The color stream is rendered at 30 fps, which means that, on average, a frame is available to annotate roughly every 0.033 s if necessary. The blue markers on the timeline are located at events of interest. On the left side of the screen, a set of fields corresponding to the information of interest is displayed and filled in for each event of interest.
Figure 5: Annotations in Kinect Studio are introduced as metadata fields for the Timeline Marker
The collection of entries describing the interaction is written to an output text file that can be processed with conventional database software tools. The set of annotations obtained within MS Kinect Studio is thus exported as a text table whose record structure is presented in Fig. 6; it contains the events and relations to the roles of the surgical staff listed in Table 1, as well as the fields of information for the instrument presented in Table 2.
Figure 6: Structure of the annotation record obtained from the metadata associated to the timeline markers in Kinect Studio
Figure 7: Annotations Database as processed within Excel for a surgery of 30 minutes
The annotations database obtained within Kinect Studio for the surgical procedure1 used as an example in this report was exported to MS Excel for analysis. A partial image of this database is presented in Fig. 7, showing some of the first sequences stored. The colors in the third column are used to differentiate events that belong to the same sequence; these colors are chosen arbitrarily, and after the final event of a sequence is identified, the same color becomes available to mark a new sequence. In total, the database of this example contains 259 records for a period of 30 minutes. Database queries generate the results for predictions and commands illustrated in Fig. 8 and Fig. 9.
Figure 8 Predictions: (a) Discrimination as right, wrong or partially wrong. (b) Instruments received by the surgeon (c) Nurse hand holding instruments (d) Instruments rejected by the surgeon (e) Time elapsed while the prediction is performed.
Figure 9 Commands: (a) Discrimination as verbal, nonverbal or combination of both verbal and nonverbal (b) Instruments requested (c) Time elapsed in verbal commands (d) Time elapsed in nonverbal commands (e) Time elapsed while the instrument is requested
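As an illustration of the kind of database queries behind Fig. 8 and Fig. 9, a short pandas sketch is given below; the file name and column names (event_type, outcome, modality, sequence_id, timestamp_s) are hypothetical stand-ins for the exported annotation table.

```python
import pandas as pd

# Load the annotation table exported from Kinect Studio (file and column names are hypothetical).
df = pd.read_csv("annotations_wedge_resection.txt", sep="\t")

# Fig. 8-style query: break predictions down by outcome (right / wrong / partially wrong).
predictions = df[df["event_type"] == "Prediction"]
print(predictions["outcome"].value_counts())

# Fig. 9-style query: break commands down by modality and list the instruments requested.
commands = df[df["event_type"] == "Command"]
print(commands["modality"].value_counts())            # verbal / nonverbal / both
print(commands["instrument"].value_counts().head(10))

# Time elapsed per sequence: difference between the final and initial event of each sequence.
elapsed = df.groupby("sequence_id")["timestamp_s"].agg(lambda t: t.max() - t.min())
print(elapsed.describe())
```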
Experimental Setup
As a preliminary step towards operating a manipulator robot as a robotic nurse, surgical personnel at the HMC Heart Hospital are asked to tie a mock knot on a synthetic model, as illustrated in Fig. 10. During the performance of this task, a Kinect sensor captures body position, hand gestures and voice commands. This information is processed by a workstation running Windows-compatible software that controls the robot, which reacts by passing the requested surgical instrument to the subject so that the task can be completed.
Figure 10: (a) Mock knot used as preliminary test of interaction (b) Robotic Set up for experiments at the HMC Heart Hospital
The robot used is a Barrett robotic manipulator with seven degrees of freedom, as shown in Fig. 10. This FDA-approved robot is one of the most advanced robotic systems considered safe to operate around human subjects, since its force-sensing capabilities are used to avoid potential impacts.
Summary
As part of an NPRP project that studies the feasibility of robotic nurses in the operating room that can recognize verbal and nonverbal commands to deliver instruments from the tray to the hand of the surgeon, we have studied the interaction activities of surgical teams performing cardiothoracic procedures at the Heart Hospital in Doha. Using state-of-the-art sensor devices, we captured a wealth of information that has been carefully analyzed and annotated into databases. At the 2016 QF Annual Research Forum Conference we would like to present our current findings as well as the results of human-robot interaction tests with a manipulator robot acting as a robotic nurse, executing a task that involves gesture/voice recognition, recognition of the instrument and safe delivery to the surgeon.
1 Wedge Lung Resection. In this procedure the surgeon removes a small wedge-shaped piece of lung that contains cancer and a margin of healthy tissue around the cancer.
-
-
-
Design and Performance Analysis of VLC-based Transceiver
Authors: Amine Bermak and Muhammad Asim Atta
Background
As the number of handheld devices increases, wireless data traffic is expanding exponentially. With the ever-increasing demand for higher data rates, it will be very challenging for system designers to meet this requirement using the limited Radio Frequency (RF) communication spectrum. One possible remedy is the use of the freely available visible light spectrum [1].
Introduction
This paper proposes an indoor communication system based on Visible Light Communication (VLC). VLC technology utilizes the visible light spectrum (380–750 nm) not only for illumination but also for data communication [2]. Visible Light Communication exploits the high-frequency switching capability of Light Emitting Diodes (LEDs) to transmit data. A receiver, generally containing a photodiode, receives the signal from the optical source and can easily decode the transmitted information. In practical systems, using a CMOS imager containing an array of photodiodes as the receiver is preferred over a single photodiode: such a receiver enables multi-target detection and multi-channel communication, resulting in a more robust transceiver architecture [3].
Method
This work demonstrates a real-time transceiver implementation for Visible Light Communication on an FPGA. A Pseudo Noise (PN) sequence is generated to act as the input data for the transmitter. A Direct Digital Synthesizer (DDS) is implemented to generate the carrier signal for modulation [4]. The transmitter uses On-Off Keying (OOK) to modulate the incoming data due to its simplicity [5]. The modulated signal is then converted into analog form using a Digital-to-Analog Converter (DAC). An analog driver circuit, connected to the digital transmitter, is capable of driving an array of Light Emitting Diodes (LEDs) for data transmission. The block-level architecture of the VLC transmitter is shown in Fig. 1.
The receiver architecture uses analog circuitry, including photodiodes for optical detection and operational amplifiers for amplification of the received signal. Analog-to-Digital Conversion (ADC) is performed before the data are passed back to the FPGA for demodulation and data reconstruction. Figure 2 shows the architecture of the VLC receiver.
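To make the transmit/receive chain concrete, the following NumPy sketch simulates OOK modulation of a pseudo-random bit stream onto a 5 MHz carrier at 1 Mbps and recovers it with non-coherent per-bit energy detection; the sample rate, noise level and threshold are illustrative assumptions and do not reflect the actual FPGA implementation.

```python
import numpy as np

fs, fc, rb = 50e6, 5e6, 1e6             # sample rate, carrier, bit rate (illustrative)
spb = int(fs // rb)                      # samples per bit
rng = np.random.default_rng(0)

bits = rng.integers(0, 2, 64)            # stand-in for the PN sequence
t = np.arange(len(bits) * spb) / fs
carrier = np.sin(2 * np.pi * fc * t)     # DDS-generated carrier (modelled ideally)

# OOK: transmit the carrier during '1' bits, nothing during '0' bits.
tx = np.repeat(bits, spb) * carrier

# Optical channel modelled as additive noise only (no ambient light or LED bandwidth limit).
rx = tx + 0.2 * rng.standard_normal(tx.size)

# Non-coherent receiver: per-bit energy detection, threshold at half the nominal '1' energy.
energy = (rx ** 2).reshape(len(bits), spb).sum(axis=1)
decided = (energy > 0.5 * spb / 2).astype(int)

print("BER:", np.mean(decided != bits))
```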
Results and Conclusion
The system is implemented and tested using a Xilinx Spartan-3A series FPGA [6]. The basic transceiver implementation uses a data rate of 1 Mbps with a carrier frequency of 5 MHz. However, in VLC, the data rate and carrier frequency directly affect the optical characteristics of the LEDs, including color and intensity. Therefore, different data rates and modulation frequencies are evaluated to find the optimum for data transmission with minimal effect on the optical characteristics of the LEDs. System complexity in terms of hardware resources and performance, including Bit Error Rate (BER) under varying conditions, is also compared.
Results demonstrate that it is feasible to establish a low-data-rate communication link for indoor applications ranging up to 10 m using commercially available LEDs. Integrating a CMOS imager at the receiver end will enable a VLC-based Multiple-Input-Multiple-Output (MIMO) communication link that can serve multiple channels, up to one channel per pixel [3]. Higher data rates are also achievable by using high-rate modulation techniques such as OFDM, at the expense of computational complexity and hardware resource utilization [7].
One possible implication of this work is the implementation of a VLC-based indoor positioning and navigation system. It could benefit large public buildings, including but not limited to hospitals, customer support centers, public service offices, shopping malls and libraries. Such a system would largely reuse the existing indoor illumination infrastructure with the added advantage of data communication.
The study also proposes extending this work to outdoor VLC applications. However, more robust algorithms are required for outdoor communication due to the optical noise and interference caused by weather and atmospheric conditions. The robustness of the existing algorithm can be increased by integrating Direct Sequence Spread Spectrum (DSSS) together with OOK for modulation. Further research is required to evaluate the performance, complexity and robustness of this system under realistic conditions.
References
[1] Cisco Visual Networking Index, “Global Mobile Data Traffic Forecast Update, 2012–2017,” CISCO, White Paper, Feb. 2013.
[2] D. Terra, N. Kumar, N. Lourenço, and L. N. Alves, “Design, development and performance analysis of DSSS-based transceiver for VLC,” IEEE EUROCON - International Conference on Computer as a Tool, 2011.
[3] “Image Sensor Communication,” VLCC Consortium.
[4] Xilinx DDS Compiler IP Core, http://www.xilinx.com/products/intellectual property/dds_compiler.html#documentation
[5] N. Lourenço, D. Terra, N. Kumar, L. N. Alves, and R. L. Aguiar, “Visible Light Communication System for Outdoor Applications,” 8th IEEE/IET International Symposium on Communication Systems, Networks and Digital Signal Processing.
[6] Xilinx Spartan-3A Starter Kit, http://www.xilinx.com/products/boards-and-kits/hw-spar3a-sk-uni-g.html
[7] L. Grobe, A. Paraskevopoulos, J. Hilt, D. Schulz, F. Lassak, F. Hartlieb, C. Kottke, V. Jungnickel, and K.-D. Langer, “High Speed Visible Light Communication Systems,” IEEE Communications Magazine, December 2013.
-
-
-
A Robust Unified Framework of Vehicle Detection and Tracking for Driving Assistance System with High Efficiency
Authors: Amine Bermak and Bo Zhang
Background
Research by the Qatar Road Safety Studies Center (QRSCC) found that the total number of traffic accidents in Qatar was 290,829 in 2013, with a huge economic cost amounting to 2.7 percent of the country's gross domestic product (GDP). There is a growing research effort to improve road safety and to develop automobile driving assistance systems, or even self-driving systems like the Google project, which are widely expected to revolutionize the automotive industry. Vision sensors will play a prominent role in such applications because they provide intuitive and rich information about the road condition. However, vehicle detection and tracking based on vision information is a challenging task because of the large variability in vehicle appearance, interference from strong light, sometimes severe weather conditions, and complex interactions among drivers.
Objective
While previous work usually regards vehicle detection and tracking as separate tasks [1, 2], we propose a unified framework for both. In the detection phase, recent work has mainly focused on building detection systems based on robust feature sets such as histograms of oriented gradients (HOG) [3] and Haar-like features [4], rather than simple features such as symmetry or edges. However, these robust features involve heavy computation. In this work, we propose an algorithmic framework that targets both high efficiency and robustness while keeping the computational requirements at an acceptable level.
Method
In the detection phase, in order to reduce the processing latency, we propose to use a hardware-friendly corner detection method, features from accelerated segment test (FAST) [5], which determines interest corners by comparing each pixel with the pixels on a surrounding circle: if at least 9 contiguous pixels on this circle are all brighter or darker than the center pixel, it is marked as a corner point. Fig. 1 shows the result of the FAST corner detector on a real road image. We use the recent Learned Arrangements of Three Patch Codes (LATCH) [6] as the corner-point descriptor. The descriptor falls into the binary descriptor category, but still maintains performance comparable to histogram-based descriptors (like HOG). The descriptors created by LATCH are binary strings computed by comparing image patch triplets rather than image pixels; as a result, they are less sensitive to noise and minor changes in local appearance. In order to detect vehicles, corners in successive images are matched to those in the previous images, so that an optical flow vector at each corner point can be derived from the movement of the corner points. Because vehicles approaching in the opposite direction produce a flow that diverges from the flow due to ego-motion, vehicles can be detected from the flow field. Fig. 2 illustrates the flow estimated from corner-point matching. The sparse optical flow proposed here is quite robust because of the LATCH characteristics, and it also requires much lower computational resources than traditional optical flow methods that need to solve time-consuming optimization problems.
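A minimal OpenCV sketch of this detection step is shown below, assuming the opencv-contrib build (which provides LATCH); it extracts FAST corners, computes LATCH descriptors, and matches them between consecutive frames with the Hamming distance to obtain sparse flow vectors. It only illustrates the idea and is not the authors' implementation.

```python
import cv2
import numpy as np

def sparse_flow(prev_gray, curr_gray):
    """Return matched point pairs (sparse flow vectors) between two consecutive grayscale frames."""
    fast = cv2.FastFeatureDetector_create(threshold=25)
    latch = cv2.xfeatures2d.LATCH_create()           # requires opencv-contrib-python

    kp1 = fast.detect(prev_gray, None)
    kp2 = fast.detect(curr_gray, None)
    kp1, des1 = latch.compute(prev_gray, kp1)
    kp2, des2 = latch.compute(curr_gray, kp2)

    # Binary descriptors are matched with the Hamming distance; cross-checking filters outliers.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    pts_prev = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts_curr = np.float32([kp2[m.trainIdx].pt for m in matches])
    return pts_prev, pts_curr                         # flow vectors = pts_curr - pts_prev
```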
Once vehicles are detected, the tracking phase is achieved by matching the corner points. Using a Kalman filter for prediction, the matching is fast because candidate corner points are only searched near the predicted location. Using corner points to compute sparse optical flow enables vehicle detection and tracking to be carried out simultaneously within this unified framework (Fig. 3). In addition, the framework allows us to detect cars that newly enter the scene during tracking. Since most image sensors today use a rolling-shutter integration approach, the image information can be streamed serially to the FPGA-based hardware, and hence the FAST detector and LATCH descriptor can work in a pipelined manner for efficient computation.
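The Kalman-assisted matching could be sketched as follows, assuming a constant-velocity model on the tracked vehicle's image position; the gate radius and noise covariances are illustrative values, not those used in the paper.

```python
import cv2
import numpy as np

def make_tracker(x0, y0):
    """Constant-velocity Kalman filter on image coordinates (state: x, y, vx, vy)."""
    kf = cv2.KalmanFilter(4, 2)
    kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                    [0, 1, 0, 1],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                     [0, 1, 0, 0]], np.float32)
    kf.processNoiseCov = 1e-2 * np.eye(4, dtype=np.float32)
    kf.measurementNoiseCov = 1e-1 * np.eye(2, dtype=np.float32)
    kf.statePost = np.array([[x0], [y0], [0], [0]], np.float32)
    return kf

def gated_match(kf, candidates, radius=30.0):
    """Predict the next position, then only consider corner points inside the gate."""
    pred = kf.predict()[:2].ravel()
    nearby = [p for p in candidates if np.hypot(p[0] - pred[0], p[1] - pred[1]) < radius]
    if nearby:
        best = min(nearby, key=lambda p: np.hypot(p[0] - pred[0], p[1] - pred[1]))
        kf.correct(np.array([[best[0]], [best[1]]], np.float32))
        return best
    return tuple(pred)   # no measurement inside the gate: keep the prediction
```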
Conclusion
In this work, we propose a framework for detecting and tracking vehicles for driving assistance applications. Vehicles are detected from the sparse flow estimated by corner-point matching, and tracking is performed with the same corner-point matching assisted by a Kalman filter. The proposed framework is robust and efficient and has much lower computational requirements, making it a very viable solution for embedded vehicle detection and tracking systems.
References
[1] S. Sivaraman and M. M. Trivedi, “Looking at vehicles on the road: A survey of vision-based vehicle detection, tracking, and behavior analysis,” IEEE Trans. Intell. Transp. Syst., vol. 14, no. 4, pp. 1773–1795, 2013.
[2] Z. Sun, G. Bebis, and R. Miller, “On-Road Vehicle Detection: A Review,” vol. 28, no. 5, pp. 694–711, 2006.
[3] Z. Sun, G. Bebis, and R. Miller, “Monocular precrash vehicle detection: Features and classifiers,” IEEE Trans. Image Process., vol. 15, no. 7, pp. 2019–2034, 2006.
[4] W. C. Chang and C. W. Cho, “Online boosting for vehicle detection,” IEEE Trans. Syst. Man, Cybern. Part B Cybern., vol. 40, no. 3, pp. 892–902, 2010.
[5] E. Rosten and T. Drummond, “Fusing points and lines for high performance tracking,” Tenth IEEE Int. Conf. Comput. Vis. (ICCV), vol. 2, pp. 1508–1515, 2005.
[6] G. Levi and T. Hassner, “LATCH: Learned Arrangements of Three Patch Codes,” arXiv, 2015.
-
-
-
On Arabic Multi-Genre Corpus Diacritization
Authors: Houda Bouamor, Wajdi Zaghouani, Mona Diab, Ossama Obeid, Kemal Oflazer, Mahmoud Ghoneim and Abdelati Hawwari
One of the characteristics of writing in Modern Standard Arabic (MSA) is that the commonly used orthography is mostly consonantal and does not provide full vocalization of the text. It sometimes includes optional diacritical marks (henceforth, diacritics or vowels).
Arabic script consists of two classes of symbols: letters and diacritics. Letters comprise long vowels such as A, y, w as well as consonants. Diacritics, on the other hand, comprise short vowels, gemination markers, nunation markers, and other markers (such as the hamza, the glottal stop which appears in conjunction with a small number of letters, dots on letters, elongation and emphatic markers) which in all, if present, render a more or less exact reading of a word. In this study, we mostly address three types of diacritical marks: short vowels, nunation, and shadda (gemination).
Diacritics are extremely useful for text readability and understanding. Their absence in Arabic text adds another layer of lexical and morphological ambiguity. Naturally occurring Arabic text has some percentage of these diacritics present depending on genre and domain. For instance, religious texts such as the Quran are fully diacritized to minimize the chances of reciting them incorrectly, and so are children's educational texts. Classical poetry tends to be diacritized as well. However, news text and other genres are only sparsely diacritized (e.g., around 1.5% of tokens in the United Nations Arabic corpus bear at least one diacritic (Diab et al., 2007)).
In general, building models to assign diacritics to each letter in a word requires a large amount of annotated training corpora covering different topics and domains to overcome the sparseness problem. The currently available diacritized MSA corpora are generally limited to the newswire genre (the corpora distributed by the LDC) or religion-related texts such as the Quran or the Tashkeela corpus. In this paper we present a pilot study in which we annotate a sample of non-diacritized text extracted from five different text genres. We explore different annotation strategies, presenting the data to the annotator in three modes: basic (only forms with no diacritics), intermediate (basic forms plus POS tags), and advanced (a list of forms that is automatically diacritized). We show the impact of the annotation strategy on the annotation quality.
It has been noted in the literature that complete diacritization is not necessary for readability (Hermena et al., 2015) or for NLP applications; in fact, Diab et al. (2007) show that full diacritization has a detrimental effect on SMT. Hence, we are interested in discovering the optimal level of diacritization and, accordingly, we explore different levels. In this work, we limit our study to two diacritization schemes: FULL and MIN. For FULL, all diacritics are explicitly specified for every word. For MIN, we explore the minimum and optimal number of diacritics that needs to be added in order to disambiguate a given word in context and make a sentence easily readable and unambiguous for any NLP application.
We conducted several experiments on a set of sentences that we extracted from five corpora covering different genres. We selected three corpora from the currently available Arabic Treebanks from the Linguistic Data Consortium (LDC). These corpora were chosen because they are fully diacritized and had undergone significant quality control, which allows us to evaluate the annotation accuracy as well as our annotators' understanding of the task. We selected a total of 16,770 words from these corpora for annotation. Three native Arabic annotators with a good linguistic background annotated the corpora samples. Diab et al. (2007) define six different diacritization schemes inspired by the observation of the relevant naturally occurring diacritics in different texts. We adopt the FULL diacritization scheme, in which all the diacritics of a word should be specified, and annotators were asked to fully diacritize each word.
The text genres were annotated following the different strategies:
- Basic: In this mode, words are presented for annotation with all diacritics removed, including the naturally occurring ones. The words are presented to the annotators in a raw tokenized format, in context.
- Intermediate: In this mode, we provide the annotator with words along with their POS information. The intuition behind adding POS is to help the annotator disambiguate a word by narrowing down on the diacritization possibilities.
- Advanced: In this mode, the annotation task is formulated as a selection task instead of an editing task. Annotators are provided with a list of automatically diacritized candidates and are asked to choose the correct one, if it appears in the list. Otherwise, if they are not satisfied with the given candidates, they can manually edit the word and add the correct diacritics. This technique is designed to reduce annotation time and especially annotator workload. For each word, we generate a list of vowelized candidates using MADAMIRA (Pasha et al., 2014). MADAMIRA achieves a lemmatization accuracy of 99.2% and a diacritization accuracy of 86.3%. We present the annotator with the top three candidates suggested by MADAMIRA, when possible; otherwise, only the available candidates are provided.
We also provided annotators with detailed guidelines, describing our diacritization scheme and specifying how to add diacritics for each annotation strategy. We described the annotation procedure and specified how to deal with borderline cases. We also provided in the guidelines many annotated examples to illustrate the various rules and exceptions.
In order to determine the most efficient annotation setup for the annotators, in terms of speed and effort, we compare the results obtained with the three annotation strategies. These annotations are all conducted for the FULL scheme. We first calculated the number of words annotated per hour, for each annotator and in each mode. As expected, in the Advanced mode our three annotators could annotate an average of 618.93 words per hour, roughly double the rate of the Basic mode (only 302.14 words). Adding POS tags to the basic forms, as in the Intermediate mode, does not accelerate the process much: only about 90 more words are diacritized per hour compared to the Basic mode.
Then, we evaluated the Inter-Annotator Agreement (IAA) to quantify the extent to which independent annotators agree on the diacritics chosen for each word. For every text genre, two annotators were asked to annotate independently a sample of 100 words.
We measured the IAA between two annotators by averaging the WER (Word Error Rate) over all pairs of annotated words; the higher the WER between two annotations, the lower their agreement. The results obtained show clearly that the Advanced mode is the best strategy to adopt for this diacritization task: it is the least confusing method across all text genres (with WER between 1.56 and 5.58).
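As a sketch of this IAA computation, the snippet below uses a simplified word-level error rate (the percentage of aligned words whose diacritized forms differ between the two annotators); the study may use a stricter WER definition.

```python
def word_error_rate(annotation_a, annotation_b):
    """Percentage of aligned words whose diacritized forms differ between two annotators.
    Simplified: assumes both annotations share the same tokenization."""
    words_a, words_b = annotation_a.split(), annotation_b.split()
    assert len(words_a) == len(words_b), "annotations must cover the same tokens"
    mismatches = sum(a != b for a, b in zip(words_a, words_b))
    return 100.0 * mismatches / len(words_a)

def average_iaa(samples):
    """Average WER over a list of (annotator_1, annotator_2) 100-word samples for one genre."""
    return sum(word_error_rate(a, b) for a, b in samples) / len(samples)
```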
We also conducted a preliminary study for a minimum diacritization scheme. This is a diacritization scheme that encodes the most relevant differentiating diacritics to reduce confusability among words that look the same (homographs) when undiacritized but have different readings. Our hypothesis in MIN is that there is an optimal level of diacritization to render a text unambiguous for processing and enhance its readability. We showed the difficulty in defining such a scheme and how subjective this task can be.
Acknowledgement
This publication was made possible by grant NPRP-6-1020-1-199 from the Qatar National Research Fund (a member of the Qatar Foundation).
-
-
-
QUTor: QUIC-based Transport Architecture for Anonymous Communication Overlay Networks
Authors: Raik Aissaoui, Ochirkhand Erdene-Ochir, Mashael Al-Sabah and Aiman Erbad
In this new century, the growth of Information and Communication Technology (ICT) has had a significant influence on our lives. The wide spread of the Internet has created an information society in which the creation, distribution, use, integration and manipulation of information is a significant economic, political, and cultural activity. However, it has also brought its own set of challenges. Internet users have become increasingly vulnerable to online threats like botnets, Denial of Service (DoS) attacks and phishing spam mail. Stolen user information can be exploited by many third-party entities. Some Internet Service Providers (ISPs) sell this data to advertising companies, which analyse it and build marketing strategies to influence customer choices, breaking their privacy. Oppressive governments exploit revealed private user data to harass members of opposition parties, civil society activists and journalists. Anonymity networks have been introduced to allow people to conceal their identity online. This is done by providing unlinkability between the user's IP address, his digital fingerprint, and his online activities. Tor is the most widely used anonymity network today, serving millions of users on a daily basis through a growing number of volunteer-run routers [1]. Clients send their data to their destinations through a number of volunteer-operated proxies, known as Onion Routers (ORs). If a user wants to use the network to protect his online privacy, the user installs the Onion Proxy (OP), which bootstraps by contacting centralized servers, known as authoritative directories, to download the needed information about the ORs serving in the network. Then, the OP builds overlay paths, known as circuits, which consist of three ORs (entry guard, middle and exit), where only the entry guard knows the user, and only the exit knows the destination. Tor helps Internet users hide their identities, but it introduces large and highly variable delays in response and download times during web surfing that can be inconvenient for users. Traffic congestion adds further delays and variability to the performance of the network. Moreover, Tor uses an end-to-end flow control approach that does not react to congestion in the network.
To improve Tor performance, we propose to integrate QUIC into Tor. QUIC [2] (Quick UDP Internet Connections) is a new multiplexed and secure transport atop UDP, developed by Google. QUIC is implemented over UDP to solve a number of transport-layer and application-layer problems experienced by modern web applications. It reduces connection establishment latency: QUIC handshakes frequently require zero round trips before sending payload. It also improves congestion control and multiplexes streams without head-of-line blocking: because QUIC is designed for multiplexed streams, lost packets carrying data for an individual stream generally only impact that specific stream. In order to recover from lost packets without waiting for a retransmission, QUIC can complement a group of packets with a Forward Error Correction (FEC) packet. QUIC connections are identified by a 64-bit connection identifier (ID); when a QUIC client changes Internet Protocol (IP) address, it can continue to use the old connection ID from the new IP address without interrupting any in-flight requests. QUIC provides multiplexing and flow control equivalent to HTTP/2, security equivalent to TLS, and connection semantics, reliability, and congestion control equivalent to TCP. QUIC shows good performance against HTTP/1.1 [3]. We expect good results in improving the performance of Tor, since QUIC is one of the most promising solutions to decrease latency [4]. A QUIC stream is a bi-directional flow of bytes across a logical channel within a QUIC connection; the latter is a conversation between two QUIC endpoints with a single encryption context that multiplexes streams within it. QUIC's multiple-stream architecture improves Tor performance and solves the head-of-line blocking problem. As a first step, we implemented QUIC in OR nodes so that they can be easily upgraded to the new architecture without modifying the end-user OP. Integrating QUIC will not degrade Tor security, as it provides security equivalent to TLS (QUIC Crypto) and will soon use TLS 1.3.
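The head-of-line blocking argument can be illustrated with a small toy model (plain Python, not real QUIC or Tor code): with a single ordered stream, one lost packet blocks everything ordered behind it, whereas with per-stream ordering a loss only stalls its own stream.

```python
def deliverable(packets, lost):
    """Packets are (stream_id, seq). A packet is deliverable to the application only if no
    earlier packet it is ordered after is missing (i.e. present in `lost`)."""
    delivered = []
    stalled = set()                       # streams currently blocked by a gap
    for stream_id, seq in packets:
        if (stream_id, seq) in lost or stream_id in stalled:
            stalled.add(stream_id)        # a gap stalls everything ordered behind it
        else:
            delivered.append((stream_id, seq))
    return delivered

packets = [(s, q) for q in range(3) for s in ("A", "B", "C")]   # 3 streams, 3 packets each
lost = {("B", 0)}                                               # one packet is lost

# TCP-like: a single ordered stream -> every packet behind the loss is blocked.
single = [("tcp", i) for i in range(len(packets))]
print(len(deliverable(single, {("tcp", 1)})))   # only 1 packet delivered

# QUIC-like: per-stream ordering -> only stream B stalls, A and C keep flowing.
print(len(deliverable(packets, lost)))          # 6 packets delivered
```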
-
-
-
Cognitive Dashboard for Teachers Professional Development
Authors: Riadh Besbes and Seifeddine Besbes
Introduction
This research aims to enhance the culture of data in education, which is in the middle of a major transformation driven by technology and big data analytics. The core purpose of schools is providing an excellent education to every learner, and data can be the lever of that mission. Big data analytics is the process of examining large data sets containing a variety of data types to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. Valuable lessons can be learnt from other industries when considered in terms of their practicality for public education. Hence, big data analytics in education, also known as educational data mining and learning analytics, develops capacity for quantitative research in response to the growing need for evidence-based analysis related to education policy and practice. However, education has been slow to follow the data analytics evolution due to difficulties surrounding what data to collect, how to collect those data and what they might mean. Our research identifies, quantifies, and measures qualitative teaching practices and learning performances, and tracks learners' academic progress. Teaching and learning databases are accumulated from quantitative measures collected through classroom visits within academic institutions, learners' answers to online questionnaires, analysis of the written statements of academic exams in mathematics, science, and literacy, and online entry of item-level grades from the written traces of learners' performance in those exams. The project's data mining strategy will support and develop teachers' expertise, enhance and scaffold students' learning, and improve the education system's performance. The supervisor's expertise will mentor the researcher in extracting information and educational knowledge from the collected data. As a consequence, the researcher will acquire the wisdom to use this knowledge and translate it into more effective training sessions on concrete educational policies.
State-of-the-art
Anne Jorro says: “to evaluate is necessarily to consider how we will support, advise and exchange, giving recognition to encourage the involvement of the actor and giving him the means to act”. The PISA report states that many of the world's best-performing education systems have moved from bureaucratic “command and control” environments towards school systems in which the people at the frontline have much more control. Making teaching and learning data available leads to information and then to knowledge extraction; as advised by the PISA report, the effective use of extracted knowledge drives decision making towards wisdom. Linda Darling-Hammond and Charles E. Ducommun underscore the important assumption that, undoubtedly, teachers are the fulcrum with the biggest impact, making any school initiative lead toward success or failure. Rivkin et al. note that a teacher's classroom instructional practice is perhaps one of the most important yet least understood factors contributing to teacher effectiveness. As a consequence, many classroom observation tools have been designed to demystify effective teaching practices. The Classroom Assessment Scoring System (CLASS) is a well-respected classroom climate observational system. The CLASS examines three domains of behaviour: firstly, emotional support (positive classroom climate, teacher sensitivity, and regard for student perspectives); secondly, classroom organization (effective behaviour management, productivity, and instructional learning formats); and thirdly, instructional support (concept development, quality of feedback, and language modelling). The Framework for Teaching method for classroom observation was most recently released as a 2013 edition. It divides the complex activity of teaching into 22 components clustered into four domains of teaching responsibility. This latest edition of the tool was conceived to respond to the instructional implications of the American Common Core State Standards. Those standards envision, for literacy and mathematics initially, deep engagement by students with important concepts, skills, and perspectives. They emphasize active, rather than passive, learning by students. In all areas, they place a premium on deep conceptual understanding, thinking and reasoning, and the skill of argumentation. Heather Hill from Harvard University and Deborah Loewenberg Ball from the University of Michigan developed the “Mathematical Quality of Instruction (MQI)” instrument. Irving Hamer is an education consultant and a former deputy superintendent for academics, technology, and innovation for a school system.
Objectives
Our project's wider objective is to improve teaching and learning effectiveness within K12 classes by exploiting data mining methods for educational knowledge extraction. The researcher makes three daily visits to mathematics, science, and literacy courses. Using his interactive educational grid, an average of 250 numerical data points is stored as quantified teaching and learning practices for each classroom visit to each teacher. In parallel with these field activities, remote interaction via a website takes place. At the beginning, and only once, each learner from the classes planned to be visited fills in an individual questionnaire form for learning style identification. He then enters, on another website form, every item-level grade for each question of his maths, science and literacy exam answer sheets; those exam statements were previously analysed and stored on the website by the researcher. An average of 150 numerical data points is stored as quantified learning performance for every learner. Meetings at the partner university for data analytics and educational knowledge extraction were held, followed by meetings at the inspectorate headquarters for in-depth data review. Then, in partner schools, training sessions were the theatre of constructive reflections and feedback on the major findings about teaching and learning effectiveness. These actions were reiterated over several months. Each year, the performance of around 1000 students and the educational practices of around 120 teachers will be specifically tracked. During the summer months, workshops, seminars, and an international conference will be organised for stakeholders from the educational field. Thus, among the project's actions, three specific objectives shall be achieved. First, sufficient data on students' profiles and performances related to educational weaknesses and strengths will be provided. Second, teachers' practices inside classrooms at each partner school will be statistically recorded. Third, a complete data mining centre for educational research will be established; its findings will be cognitively interpreted by research teams and then presented for teachers' reflexive thoughts and discussions within meetings and training sessions.
Research methodology and approach
-
-
-
Dynamic Scheduled Access Medium Access Control for Emerging Wearable Applications
Authors: Muhammad Mahtab Alam, Dhafer Ben-Arbia and Elyes Ben-Hamida
Context and Motivation
Wearable technology is emerging as one of the key enablers of the internet of everything (IoE). The technology matures by the day, with more applications than ever before, and is consequently making a significant impact on the consumer electronics industry. Given the continuous exponential rise of recent years, it is anticipated that by 2019 there will be more than 150 million wearable devices worldwide [1]. Whilst fitness and health care remain the dominant wearable applications, other applications, including fashion and entertainment, augmented reality, and rescue and emergency management, are emerging as well [2]. In this context, Wireless Body Area Networks (WBAN) are a well-established research discipline that fosters and contributes to the rapid growth of wearable technology. The IEEE 802.15.6 standard, targeted at WBAN, provides great flexibility and provisions at both the physical (PHY) and medium access control (MAC) layers [3].
Wearable devices are constrained by limited battery capacity, miniaturization, and low processing and storage capabilities. While energy efficiency remains one of the most important challenges, a low-duty-cycle and dynamic MAC layer design is critical for the longer life of these devices. In this regard, the scheduled access mechanism is considered one of the most effective MAC approaches in WBAN, in which every sensor node can have a dedicated time slot to transfer its data to the BAN coordinator. However, for a given application, every node (i.e., connected sensor) has a different data transmission rate [4]; therefore, the scheduled access mechanism has to adapt the slot allocation accordingly to meet the design constraints (i.e., energy efficiency, packet delivery and delay requirements).
Problem Description
The scheduled access MAC with a 2.4 GHz operating frequency, the highest data rate (i.e., 971 kbps), and the highest payload (i.e., 256 bytes) provides the maximum throughput in the IEEE 802.15.6 standard. However, both the packet delivery ratio (PDR) and the delay in this configuration are very poor at transmission powers of -10 dBm and below [5]. The presented study focuses on this particular PHY-MAC configuration in order to understand the maximum realistically achievable throughput while operating at the lowest transmission power for future IEEE 802.15.6 compliant transceivers. A further objective is to enhance the performance under realistic mobility patterns, i.e., space- and time-varying channel conditions.
Contribution
In this paper we address the reliability concerns of the above-mentioned wearable applications while using the IEEE 802.15.6 (high data rate) PHY-MAC configuration. The objective is to enhance the system performance by exploiting the m-periodic scheduled access mechanism. We propose a throughput- and channel-aware dynamic scheduling algorithm which provides a realistic throughput under dynamic mobility and space- and time-varying links. First, various mobility patterns are generated, with special emphasis on space- and time-varying links because their performance is most vulnerable in a dynamic environment. Deterministic pathloss values (as an estimate of the channel) are obtained from a motion capture system and bio-mechanical modeling; from these, the signal-to-noise ratio (SNR), bit error rate (BER) and packet error rate (PER) are calculated. In its first phase, the proposed algorithm uses this estimated PER to select the potential nodes for a time slot. In the second phase, based on node priority and data packet availability among the potential candidates, the slot is assigned to one node. This process is iterated by the coordinating node until the end of a superframe.
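A simplified sketch of this two-phase slot assignment is given below; the PER threshold, priority scheme and queue model are illustrative assumptions rather than the parameters used in the paper.

```python
def assign_slots(nodes, num_slots, per_estimate, per_threshold=0.1):
    """Two-phase dynamic scheduled access (sketch).
    nodes: dict node_id -> {"priority": int, "queue": int}  (higher priority wins)
    per_estimate: function (node_id, slot) -> estimated packet error rate for that slot."""
    schedule = []
    for slot in range(num_slots):
        # Phase 1: keep only nodes whose estimated PER for this slot is acceptable.
        candidates = [n for n in nodes if per_estimate(n, slot) <= per_threshold]
        # Phase 2: among candidates with data queued, pick the highest-priority node.
        ready = [n for n in candidates if nodes[n]["queue"] > 0]
        if not ready:
            schedule.append(None)                     # slot left unassigned
            continue
        chosen = max(ready, key=lambda n: nodes[n]["priority"])
        nodes[chosen]["queue"] -= 1                   # one packet transmitted in this slot
        schedule.append(chosen)
    return schedule
```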
Results
The proposed scheduling scheme shows a significant gain over a reference scheme without dynamic adaptation. On average, 20 to 55 percent more packets are received, along with 1 to 5 joules of energy savings, though at the cost of a higher delay ranging from 20 to 200 ms, while operating at low power levels (i.e., 0 dBm, -5 dBm, -10 dBm). It is recommended that future wearable IEEE 802.15.6 compliant transceivers can successfully operate at -5 dBm to -8 dBm of transmission power; further reducing the power level in a dynamic environment can degrade the performance. It is also observed that the achievable throughput of the different time-varying links is good under realistic conditions as long as the data packet generation interval is not shorter than 100 ms.
Acknowledgment: The work was supported by NPRP grant #[6-1508-2-616] from the Qatar National Research Fund which is a member of Qatar Foundation. The statements made herein are solely the responsibility of the authors.
References
[1] “Facts and statistics on Wearable Technology,” 2015. [Online]. Available: http://www.statista.com/topics/1556/wearable-technology/.
[2] M. M. Alam and E. B. Hamida, “Surveying Wearable Human Assistive Technology for Life and Safety Critical Applications: Standards, Challenges and Opportunities,” MDPI Journal on Sensors, vol. 14, no. 5, pp. 9153–9209, 2014.
[3] “802.15.6-2012 - IEEE Standard for Local and metropolitan area networks - Part 15.6: Wireless Body Area Networks,” 2012. [Online]. Available: https://standards.ieee.org/findstds/standard/802.15.6-2012.html.
[4] M. M. Alam and E. B. Hamida, “Strategies for Optimal MAC Parameters Tuning in IEEE 802.15.6 Wearable Wireless Sensor Networks,” Journal of Medical Systems, vol. 39, no. 9, pp. 1–16, 2015.
[5] M. Alam and E. BenHamida, “Performance evaluation of IEEE 802.15.6 MAC for WBSN using a space-time dependent radio link model,” in IEEE 11th AICCSA Conference, Doha, 2014.
-
-
-
Real-Time Location Extraction for Social-Media Events in Qatar
1. Introduction
Social media gives us instant access to a continuous stream of information generated by users around the world. This enables real-time monitoring of users' behavior (Abbar et al., 2015) and events' life-cycles (Weng and Lee, 2010), and large-scale analysis of human interactions in general. Social media platforms are also used to propagate influence, spread content, and share information about events happening in real time. Detecting the location of events directly from user-generated text can be useful in different contexts, such as humanitarian response, detecting the spread of diseases, or monitoring traffic. In this abstract, we define a system that can be used for any of the purposes described above, and illustrate its usefulness with an application for locating traffic-related events (e.g., traffic jams) in Doha.
The goal of this project is to design a system that, given a social-media post describing an event, predicts whether or not the event belongs to a specific category (e.g., traffic accidents) within a specific location (e.g., Doha). If the post is found to belong to the target category, the system proceeds with the detection of all possible mentions of locations (e.g., “Corniche”, “Sports R/A”, “Al Luqta Street”, etc.), landmarks (“City Center”, “New Al-Rayyan gas station”, etc.), and location expressions (e.g., “On the Corniche between the MIA park and the Souq”). Finally, the system geo-localizes (i.e., assigns latitude and longitude coordinates to) every location expression used in the description of the event. This makes it possible to place the different events onto a map; a downstream application will use these coordinates to monitor real-time traffic and geo-localize traffic-related incidents.
2. System Architecture
In this section we present an overview of our system. We first describe its general “modular” architecture, and then proceed with the description of each module.
2.1. General view
The general view of the system is depicted in Figure 1. The journey starts by listening to social media platforms (e.g., Twitter, Instagram) to catch relevant social posts (e.g., tweets, check-ins) using a list of handcrafted keywords related to the context of the system (e.g., road traffic). Relevant posts are then pushed through a three-step pipeline in which we first double-check the relevance of the post using a binary classifier (Content Filter). We then extract the location names mentioned in the posts, if any. Next, we geo-locate the identified locations to their accurate placement on the map. This process allows us to filter out undesirable posts and to augment the relevant ones with precise geo-location coordinates, which are finally exposed for consumption via a RESTful API. We provide details on each of these modules below.
Figure 1: Data processing pipeline.
2.2. Content filter
The Content Filter consists of a binary classifier that, given a tweet deemed to be about Doha, decides whether the tweet is a real-time report about traffic in Doha or not. The classifier receives as input tweets that have been posted from a location enclosed in a geographic rectangle (or bounding box) that roughly corresponds to Doha, and that contain one or more keywords expected to refer to traffic-related events (e.g., “accident”, “traffic”, “jam”, etc.). The classifier is expected to filter out those tweets that are not real-time reports about traffic (e.g., tweets that mention “jam” as a type of food, tweets that complain about the traffic in general, etc.). We build the classifier using supervised learning technology; in other words, a generic learning process learns, from a set of tweets that have been manually marked as being either real-time reports about traffic or not, the characteristics that a new tweet should have in order to be considered a real-time report about traffic. For our project, 1000 tweets have been manually marked for training purposes. When deciding about a new tweet, the classifier looks for “cues” that, in the training phase, have been found to be “discriminative”, i.e., helpful in taking the classification decision. In our project, we used the Stanford Maximum Entropy Classifier (Manning and Klein, 2003) to perform the discriminative training. In order to generate candidate cues, the tweet is preprocessed via a pipeline of natural language analysis tools, including a social-media-specific tokenizer (O'Connor et al., 2010) which splits words, and a rule-based Named-Entity Simplifier which replaces mentions of local entities with their corresponding meta-categories (for example, it replaces “@moi_qatar” or “@ashghal” with “government_entity”).
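For illustration, the sketch below trains a comparable maximum-entropy-style content filter with scikit-learn's logistic regression as a stand-in for the Stanford classifier, assuming tweets have already been tokenized and entity-simplified upstream; the tiny training set and labels are made up.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: (preprocessed tweet text, label) pairs from the 1000 marked tweets.
train_texts = [
    "accident on government_entity road near sports roundabout",   # real-time traffic report
    "i love traffic jam toast in the morning",                      # not a traffic report
]
train_labels = [1, 0]

# Bag-of-words and bigram cues feeding a maximum-entropy-style (logistic regression) classifier.
content_filter = make_pipeline(
    CountVectorizer(ngram_range=(1, 2), lowercase=True),
    LogisticRegression(max_iter=1000),
)
content_filter.fit(train_texts, train_labels)

print(content_filter.predict(["huge accident on the corniche right now"]))  # expected: [1]
```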
2.3. NLP components
The Location Expression Extractor is a module that identifies (or extracts) location expressions, i.e., natural language expressions that denote locations (e.g., “@ the Slope roundabout”, “right in front of the Lulu Hypermarket”, “on Khalifa”, “at the crossroads of Khalifa and Majlis Al Taawon”, etc.). A location expression can be a complex linguistic object, e.g., “on the Corniche between the MIA and the underpass to the airport”. A key component of the Location Expression Extractor is the Location Named Entity Extractor, i.e., a module that identifies named entities of Location type (e.g. “the Slope roundabout”) or Landmark type (e.g., “the MIA”). For our purposes, a location is any proper name in the Doha street system (e.g., “Corniche”, “TV roundabout”, “Khalifa”, “Khalifa Street”); landmarks are different from locations, since the locations are only functional to the Doha street system, while landmarks have a different purpose (e.g., the MIA is primarily a museum, although its whereabouts may be used as a proxy of a specific location in the Doha street system – i.e., the portion of the Corniche that is right in front of it).
The Location Named Entity Extractor receives as input the set of tweets that have been deemed to be about some traffic-related event in Doha, and returns the same tweet where named entities of type Location or of type Landmark have been marked as such. We generate a Location Named Entity Extractor via (again) supervised learning technology. In our system, we used the Stanford CRF-based Named Entity Recognizer (Finkel et al., 2005) to recognize named entities of type Location or of type Landmark using a set of tweets where such named entities have been manually marked. From these “training” tweets the learning system automatically recognizes the characteristics that a natural language expression should have in order to be considered a named entity of type Location or of type Landmark. Again, the learning system looks for “discriminative” cues, i.e., features in the text that may indicate the presence of one of the sought named entities. To improve the accuracy over tweets, we used a tweet-specific tokenizer (O'Connor et al., 2010), a tweet-specific Part-of-Speech tagger (Owoputi et al., 2013) and an in-house gazetteer of locations related to Qatar.
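A comparable sequence labeller can be sketched with sklearn-crfsuite as a stand-in for the Stanford CRF-based recognizer; the features, BIO tags and single training tweet below are illustrative, and the real system additionally uses a tweet-specific POS tagger and a Qatar gazetteer.

```python
import sklearn_crfsuite

def token_features(tokens, i):
    """Simple per-token cues; the real system also uses POS tags and gazetteer lookups."""
    w = tokens[i]
    return {
        "lower": w.lower(),
        "is_title": w.istitle(),
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }

# One hypothetical hand-labelled training tweet (BIO tags for Location / Landmark entities).
tokens = ["Accident", "on", "Khalifa", "Street", "near", "the", "MIA"]
labels = ["O", "O", "B-LOC", "I-LOC", "O", "O", "B-LANDMARK"]

X = [[token_features(tokens, i) for i in range(len(tokens))]]
y = [labels]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, y)

test = ["Jam", "at", "the", "Slope", "roundabout"]
print(crf.predict([[token_features(test, i) for i in range(len(test))]]))
```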
2.4. Resolving location expressions onto the map
Once location entities are extracted using the NLP components, we use the APIs of Google, Bing and Nominatim to resolve the location entities into geographic coordinates. Each location entity is geo-coded by the Google Geolocation API, the Bing Maps REST API and the Nominatim gazetteer individually. We use multiple geo-coding sources to increase the robustness of our application, as a single API might fail to retrieve geo-coding data. Given a location entity, the result of the geo-coding retrieval is formatted as a JSON object containing the name of the location entity, its address, and the corresponding geo-coding results from Bing, Google or Nominatim. The geo-coding process is validated by comparing the results of the different services used: we first make sure that the returned location falls within Qatar's bounding box, and we then compute the pairwise distances between the different geographic coordinates to ensure their consistency.
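The validation step might look like the following sketch, where Qatar's bounding box and the agreement radius are approximate assumptions and the candidate points are the results returned by the individual geo-coding services.

```python
from math import radians, sin, cos, asin, sqrt

QATAR_BBOX = (24.4, 50.7, 26.2, 51.7)   # approximate (lat_min, lon_min, lat_max, lon_max)

def in_qatar(lat, lon):
    lat_min, lon_min, lat_max, lon_max = QATAR_BBOX
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def validate(candidates, max_spread_km=2.0):
    """Keep a geo-coding result only if the services agree within a small radius inside Qatar.
    candidates: dict service_name -> (lat, lon), e.g. results from Google, Bing, Nominatim."""
    points = [p for p in candidates.values() if in_qatar(*p)]
    if not points:
        return None
    if all(haversine_km(p, q) <= max_spread_km for p in points for q in points):
        # Consistent: return the centroid of the agreeing services.
        return (sum(p[0] for p in points) / len(points),
                sum(p[1] for p in points) / len(points))
    return None
```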
2.5. Description of the RESTful API
In order to ease the consumption of the relevant geo-located posts and make it possible to integrate these posts with other platforms in a comprehensive way, we have built a RESTful API. In the context of our system, this refers to using HTTP verbs (GET, POST, PUT) to retrieve relevant social posts stored by our back-end processing.
Our API exposes two endpoints: Recent and Search. The former provides an interface to request the latest posts identified by our system; it supports two parameters: Count (the maximum number of posts to return) and Language (the language of the posts to return, i.e., English or Arabic). The latter enables querying the posts for specific keywords and returns only the posts matching them; it supports three parameters: Query (a list of keywords), Since (the date-time of the oldest post to retrieve), and From-To (two date-time parameters expressing the time interval of interest). In the case of a road traffic application, one could request tweets about “accidents” that occurred in West Bay since the 10th of October.
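A minimal Flask sketch of such an API is shown below; the endpoint paths, parameter names and post schema are illustrative and do not necessarily match the deployed service.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)
POSTS = []   # filled by the back-end pipeline; each item is a geo-located post (dict)

@app.route("/recent", methods=["GET"])
def recent():
    count = int(request.args.get("count", 20))
    language = request.args.get("language", "en")
    posts = [p for p in POSTS if p["lang"] == language]
    return jsonify(posts[-count:])          # latest posts first filled, so take the tail

@app.route("/search", methods=["GET"])
def search():
    keywords = request.args.get("query", "").split(",")
    since = request.args.get("since")       # ISO date-time string, optional
    posts = [p for p in POSTS
             if any(k and k in p["text"] for k in keywords)
             and (since is None or p["created_at"] >= since)]
    return jsonify(posts)

if __name__ == "__main__":
    app.run(port=8080)
```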
3. Target: single architecture for multiple applications
Our proposed platform is highly modular (see Figure 1). This guarantees that relatively simple changes in some modules can make the platform relevant to any application context where locating user messages on a map is required. For instance, the content classifier, the first filtering element in the pipeline, can be oriented to mobility problems in a city: accident or congestion reporting, road blockages or construction sites, etc. With a suitable classifier, our platform will collect traffic and mobility tweets and geo-locate them when possible. However, there are many other contexts in which precise location is needed. For instance, in natural disaster management, it is well established that people involved in catastrophic events (floods, typhoons, etc.) use social media as a means to create awareness and to ask for help or medical attention (Imran et al., 2013). Quite often, these messages may contain critical information for relief forces, who may not have enough knowledge of the affected place and/or accurate information on the level of damage to buildings or roads. Often, the task of reading these messages, locating them on a map and marking them is crowd-sourced to volunteers; we foresee that, in such time-constrained situations, our proposed technology would represent an advancement. Likewise, the system may be oriented towards other applications: weather conditions, leisure, etc.
4. System Instantiation
We have instantiated the proposed platform for the problem of road traffic in Doha. Our objective is to sense the traffic status in the city in real time using social media posts only. Figure 2 shows three widgets of the implemented system. First, the Geo-mapped Tweets Widget shows a map of Doha with different markers: yellow markers symbolize tweets geo-located by the users, red markers represent tweets geo-located by our system; large markers come from tweets that have an attached photo, while small markers represent text-only tweets. Second, the Popular Hashtags Widget illustrates the hashtags mentioned by users, where a larger font size indicates a more frequent hashtag. Third, the Tweets Widget lists the traffic-related tweets collected by our system.
Figure 2: Snapshot of some of the system's frontend widgets.
5. References
-
-
-
Sentiment Analysis in Comments Associated to News Articles: Application to Al Jazeera Comments
Authors: Khalid Al-Kubaisi, Abdelaali Hassaine and Ali Jaoua
Sentiment analysis is a very important research task that aims at understanding the general sentiment of a specific community or group of people. Sentiment analysis of Arabic content is still in its early development stages. In the scope of Islamic content mining, sentiment analysis helps in understanding which topics Muslims around the world are discussing, which topics are trending and which topics will be trending in the future.
This study has been conducted on a dataset of 5000 comments on news articles collected from the Al Jazeera Arabic website. All articles were about the recent war against the Islamic State. The dataset has been annotated using CrowdFlower, a website for crowdsourcing the annotation of datasets. Users manually selected whether the sentiment associated with a comment was positive, negative or neutral. Each comment has been annotated by four different users, and each annotation is associated with a confidence level between 0 and 1. The confidence level reflects whether the users who annotated the same comment agreed or not (1 corresponds to full agreement between the four annotators and 0 to full disagreement).
Our method represents the corpus by a binary relation between the set of comments (x) and the set of words (y). A relation exists between a comment (x) and a word (y) if, and only if, (x) contains (y). Three binary relations are created for comments associated with positive, negative and neutral sentiments. Our method then extracts keywords from the obtained binary relations using the hyper concept method [1]. This method decomposes the original relation into non-overlapping rectangles and highlights for each rectangle the most representative keyword. The output is a list of keywords sorted in a hierarchical ordering of importance. The obtained keyword lists associated with positive, negative and neutral comments are fed into a random forest classifier of 1000 random trees in order to predict the sentiment associated with each comment of the test set.
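A minimal sketch of this classification stage is given below, assuming the keyword lists have already been produced by the hyper concept method (not implemented here); the comments, labels and keywords are placeholders.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Placeholder training data: comments with crowd-sourced sentiment labels.
comments = ["comment text 1", "comment text 2", "comment text 3"]
labels = ["positive", "negative", "neutral"]

# Keywords extracted by the hyper concept method (placeholder vocabulary).
keywords = ["keyword_a", "keyword_b", "keyword_c"]

# Binary comment-word relation restricted to the extracted keywords.
vectorizer = CountVectorizer(vocabulary=keywords, binary=True)
X = vectorizer.fit_transform(comments)

# 1000-tree random forest, as described above.
clf = RandomForestClassifier(n_estimators=1000, random_state=0)
clf.fit(X, labels)
predicted = clf.predict(X)   # in practice, predict on the held-out 30% test split
```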
Experiments have been conducted after splitting the database into 70% training and 30% testing subsets. Our method achieves a correct classification rate of 71% when considering annotations with all confidence values, and 89% when only considering annotations with a confidence value equal to 1. These results are very promising and attest to the relevance of the extracted keywords.
In conclusion, the hyper concept method extracts discriminative keywords which are used to successfully distinguish between comments carrying positive, negative and neutral sentiments. Future work includes performing further experiments with a varying threshold on the confidence value. Moreover, by applying a part-of-speech tagger, it is planned to perform keyword extraction on words corresponding to specific grammatical roles (adjectives, verbs, nouns, etc.). Finally, it is also planned to test this method on publicly available datasets such as the Rotten Tomatoes Movie Reviews dataset [2].
Acknowledgment
This contribution was made possible by NPRP grant #06-1220-1-233 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.
References
[1] A. Hassaine, S. Mecheter, and A. Jaoua. “Text Categorization Using Hyper Rectangular Keyword Extraction: Application to News Articles Classification.” Relational and Algebraic Methods in Computer Science. Springer International Publishing, 2015. 312–325.
[2] B. Pang and L. Lee. 2005. “Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales”. In ACL, pages 115–124.
-
-
-
Flight Scheduling in the Airspace
Authors: Mohamed Kais Msakni, Mohamed Kharbeche, Mohammed Al-Salem and Abdelmagid Hammuda
This paper addresses an important problem in aircraft traffic management caused by the rapid growth of air traffic. The air route traffic control center has to deal with the different plans of airlines, in which they specify a requested entry time of their aircraft into the airspace. Each flight has to be assigned to a track and a level in order to ensure the Federal Aviation Administration (FAA) safety standards. When two flights are assigned to the same track and level, a minimum separation time has to be ensured. If this condition cannot be satisfied, one of the flights will be delayed. This solution is undesirable for many reasons, such as missed connecting flights and a decrease in passengers' satisfaction.
The problem of track-level scheduling can be defined as follows. Given a set of flights, each flight has to be assigned to one track and one level. To ensure the separation time between two flights assigned to the same track and level, it is possible to delay the requested departure time of a flight. The objective is to minimize the overall flight delay.
To deal with this problem, we propose a mixed integer programming formulation to find a flight plan that minimizes the objective function while ensuring the FAA safety standards. In particular, this model considers an aircraft-dependent separation time: the separation time depends on the types of the aircraft assigned to the same track and level. However, some problems are too large to be solved in a reasonable time with the proposed model using a commercial solver. In this study, we therefore developed a scatter search (SS) to deal with larger instances. SS is an evolutionary metaheuristic with a problem-independent structure that has been efficiently applied to a variety of optimization problems. Initially, SS starts with a set of solutions (the reference set) that is constantly updated through two procedures (solution generation and combination) with the aim of producing high-quality solutions.
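A minimal formulation along these lines could read as follows; the notation is ours and simplified, and the full model in the paper additionally handles aircraft-dependent separation times and other operational constraints.

```latex
\begin{align*}
\min \quad & \textstyle\sum_{f \in F} d_f \\
\text{s.t.} \quad
& \textstyle\sum_{t \in T} \sum_{l \in L} x_{f,t,l} = 1 && \forall f \in F \\
& (r_j + d_j) - (r_i + d_i) \ge s_{ij} - M \, (3 - x_{i,t,l} - x_{j,t,l} - y_{ij}) && \forall i \ne j,\ t \in T,\ l \in L \\
& y_{ij} + y_{ji} = 1 && \forall i < j \\
& x_{f,t,l} \in \{0,1\}, \quad y_{ij} \in \{0,1\}, \quad d_f \ge 0
\end{align*}
```

Here $x_{f,t,l}=1$ if flight $f$ is assigned to track $t$ and level $l$, $r_f$ is its requested entry time, $d_f$ its delay, $s_{ij}$ the minimum separation when $j$ follows $i$ on the same track and level, $y_{ij}=1$ if $j$ is sequenced after $i$, and $M$ a sufficiently large constant.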
In order to assess the quality of the exact method and the scatter search, we carried out an experimental study on a set of instances generated from real case data. This includes small (80 to 120 flights), medium (200 to 220 flights), and large (400 to 420 flights) instances. The mathematical model has been solved using CPLEX 12.6, and the scatter search has been coded in the C language under the Microsoft Visual Studio v12 environment. The tests were conducted on a Windows 7 machine with an Intel Core i7 and 8 GB of RAM. The model was run on each instance with a 1-hour time limit. The results show that no instance was solved to optimality. For small instances, the model and the scatter search provide comparable results; however, for medium and large instances, the scatter search gives the best results.
This conference was made possible by the UREP award [UREP 13 - 025 - 2 - 010] from the Qatar National Research Fund (a member of The Qatar Foundation).
-
-
-
Named Entity Disambiguation using Hierarchical Text Categorization
Authors: Abdelaali Hassaine, Jameela Al Otaibi and Ali Jaoua
Named entity extraction is an important step in natural language processing. It aims at finding the entities present in text, such as organizations, places or persons. Named entity extraction is of paramount importance when it comes to automatic translation, as different named entities are translated differently. Named entities are also very useful for advanced search engines which aim at retrieving detailed information regarding a specific entity. Named entity extraction is a difficult problem as it usually requires a disambiguation step: the same word might belong to different named entities depending on the context.
This work has been conducted on the ANERCorp named entities database. This Arabic database contains four different named entity categories: person, organization, location and miscellaneous. The database contains 6099 sentences, out of which 60% are used for training, 20% for validation and 20% for testing.
Our method for named entity extraction contains two main steps: the first step predicts the list of named entities which are present at the sentence level. The second step predicts the named entity of each word of the sentence.
The prediction of the list of named entities at the sentence level is done by separating the document into sentences using punctuation marks. Subsequently, a binary relation between the set of sentences (x) and the set of words (y) is created from the obtained list of sentences. A relation exists between a sentence (x) and a word (y) if, and only if, (x) contains (y). A binary relation is created for each category of named entities (person, organization, location and miscellaneous). If a sentence contains several named entities, it is duplicated in the relation corresponding to each of them. Our method then extracts keywords from the obtained binary relations using the hyper concept method [1]. This method decomposes the original relation into non-overlapping rectangles and highlights for each rectangle the most representative keyword. The output is a list of keywords sorted in a hierarchical ordering of importance. The obtained keyword lists associated with each category of named entities are fed into a random forest classifier of 10000 random trees in order to predict the list of named entities associated with each sentence. The random forest classifier produces, for each sentence, the list of probabilities corresponding to the existence of each category of named entities within the sentence:
RandomForest(sentence_i) = (P(Person), P(Organization), P(Location), P(Miscellaneous)).
Subsequently, the sentence is associated with the named entities for which the corresponding probability is larger than a threshold set empirically on the validation set.
In the second step, we create a lookup table associating each word in the database with the list of named entities to which it corresponds in the training set.
For unseen sentences of the test set, the list of named entities predicted at the sentence level is produced, and for each word, the list of predicted named entities is also produced using the previously built lookup table. Ultimately, for each word, the intersection between the two predicted lists of named entities (at the sentence and the word level) gives the final predicted named entity. If more than one named entity remains at this stage, the one with the maximum probability is kept.
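The sketch below illustrates this two-level disambiguation step; the sentence-level probabilities, the lookup table, and the 0.5 threshold are placeholders standing in for the outputs of the earlier steps.

```python
def predict_word_entity(word, sentence_probs, lookup, threshold=0.5):
    """Intersect sentence-level and word-level predictions for one word.

    sentence_probs: dict mapping entity class -> sentence-level probability.
    lookup: dict mapping word -> set of entity classes seen in training.
    threshold: set empirically on the validation set (placeholder value here).
    """
    sentence_entities = {e for e, p in sentence_probs.items() if p >= threshold}
    word_entities = lookup.get(word, set())
    candidates = (sentence_entities & word_entities) or word_entities
    if not candidates:
        return None                      # word never seen as a named entity
    # Keep the candidate with the maximum sentence-level probability.
    return max(candidates, key=lambda e: sentence_probs.get(e, 0.0))

# Hypothetical example:
probs = {"person": 0.10, "organization": 0.62, "location": 0.81, "misc": 0.05}
table = {"qatar": {"location", "organization"}}
print(predict_word_entity("qatar", probs, table))   # -> "location"
```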
We obtained an accuracy of 76.58% when considering only the lookup tables of named entities produced at the word level. When performing the intersection with the list produced at the sentence level, the accuracy reaches 77.96%.
In conclusion, hierarchical named entity extraction leads to improved results over direct extraction. Future work includes the use of other linguistic features and larger lookup tables in order to improve the results. Validation on other state-of-the-art databases is also considered.
Acknowledgements
This contribution was made possible by NPRP grant #06-1220-1-233 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.
Reference
[1] A. Hassaine, S. Mecheter, and A. Jaoua. “Text Categorization Using Hyper Rectangular Keyword Extraction: Application to News Articles Classification”. Relational and Algebraic Methods in Computer Science. Springer International Publishing, 2015. 312–325.
-
-
-
SWIPT MIMO Relaying in Spectrum Sharing Networks with Interference Cancellation
Simultaneous wireless information and power transfer (SWIPT) is a promising solution to increase the lifetime of wireless nodes and hence alleviate the energy bottleneck of energy-constrained wireless networks. To date, three different SWIPT system designs exist: integrated SWIPT, closed-loop SWIPT, and decoupled SWIPT. Integrated SWIPT is the simplest design, where power and information are extracted by the mobile from the same modulated microwave signal transmitted by a base station (BS); for this scheme, the information transfer (IT) and power transfer (PT) distances are equal. The closed-loop design splits IT and PT between uplink and downlink, with PT in the downlink and IT dedicated to the uplink. The decoupled design adds a special base station called a power beacon (PB), and PT and IT are orthogonalized by using different frequency bands or time slots to avoid interference. Powering a cognitive radio network through RF energy harvesting can therefore be efficient in terms of spectrum usage and energy limits for wireless networking. The RF energy harvesting technique is also applicable in cooperative networks, wherein an energy-constrained relay with a limited battery depends on an external charging mechanism to assist the transmission of source information to the destination. In an effort to further improve spectrum sharing network performance, a number of works have suggested incorporating multiple antenna techniques into cognitive relaying. In particular, transmit antenna selection with receive maximal ratio combining (TAS/MRC) is adopted as a low-complexity and power-efficient approach which achieves full transmit/receive diversity.
Since the secondary users (SUs) and primary users (PUs) share the same frequency band, there will inevitably be interference between them. Therefore, reducing the effect of PU interference on the performance of the secondary receiver is of significant importance. Smart antennas can be employed to mitigate the PU interference: with knowledge of the direction of arrival (DoA), the receive radiation pattern can be shaped to place deep nulls in the directions of some of the interfering signals. To this end, two null-steering algorithms have been proposed in the literature, namely the dominant interference reduction algorithm and the adaptive arbitrary interference reduction algorithm. The first algorithm requires perfect prediction and statistical ordering of the interfering signals' instantaneous powers, whereas the latter does not need prior knowledge of the statistical properties of the interfering signals. In this work, we limit our analysis to the dominant interference reduction algorithm.
In this work, we consider dual-hop relaying with an amplify-and-forward (AF) scheme where the source, the relay, and the destination are equipped with multiple antennas. The relay node experiences co-channel interference (CCI), and the purpose of array processing at the relay is to provide interference cancellation. The energy-constrained relay thus collects energy from ambient RF signals, cancels the CCI, and then forwards the information to the destination. In particular, we provide a comprehensive analysis of the system assuming antenna selection at the source and the destination. We derive the exact and asymptotic end-to-end outage probability for the proposed system model. Key parameters featuring the diversity and coding gains are also obtained.
-
-
-
Action Recognition in Spectator Crowds
Authors: Arif Mahmood and Nasir Rajpoot
During the Football Association competitions held in 2013 in the UK, 2,273 people were arrested due to events of lawlessness and disorder, according to statistics collected by the UK Home Office [1]. According to a survey of major soccer stadium disasters around the world, more than 1500 people died and more than 5000 were injured between 1902 and 2012 [2]. Understanding spectator crowd behaviour is therefore an important problem for public safety management and for the prevention of dangerous activities.
Computer vision is the platform used by researchers for efficient crowd management through video cameras. However, most research efforts primarily show results on protest crowds or casual crowds, while spectator crowds have received little attention. Likewise, action recognition research has mostly addressed actions performed by one or two actors, while recognizing the actions performed by individuals in dense spectator crowds remains an unsolved problem.
Action recognition in dense crowds poses very difficult challenges, mostly due to the low resolution of the subjects and the significant variations in how the same individual performs an action. Different individuals also perform the same action quite differently, the spatial distribution of performers varies with time, and a scene contains multiple actions at the same time. Thus, compared to single-actor action recognition, noise and outliers are significantly larger and the start and end of an action are not well defined, making action recognition very difficult.
In this work we aim to recognize the actions performed by individuals in spectator crowds. For this purpose we consider a recently released dataset of spectators at the 26th Winter Universiade held in Italy in 2013 [3]. Data were collected during the last four matches held in the same ice stadium using 5 cameras. Three high-resolution cameras focussed on different parts of the spectator crowd with 1280 × 1024 pixel resolution and 30 fps temporal resolution. Figure 1 shows an example image from the spectator crowd dataset.
For action recognition in spectator crowds, we propose to compute dense trajectories in the crowd videos using optical flow [4]. Trajectories are initiated on a dense grid, and the starting points must satisfy a quality measure based on the KLT feature tracker (Fig. 2). Trajectories exhibiting motion lower than a minimum threshold are discarded. Along each trajectory, shape and texture are encoded using Histograms of Oriented Gradients (HOG) features [5] and motion is encoded using Histogram of Flow (HOF) features [6]. The resulting feature vectors are grouped using the person bounding boxes provided in the dataset (Fig. 4). Note that person detectors specifically designed for the detection and segmentation of persons in dense crowds can also be used for this purpose [7].
All trajectories corresponding to a particular person are considered to encode the actions performed by that person. These trajectories are divided into overlapping temporal windows of 30 frames (i.e., 1.00 second). Two consecutive windows have an overlap of 66%. Each person-time window is encoded using the bag-of-words technique, as explained below.
The S-HOCK dataset contains 15 videos of spectator crowds. We use 10 videos for training and the remaining 5 videos for testing. From the training videos, 100,000 trajectories are randomly sampled and grouped into 64 clusters using the k-means algorithm. Each cluster center is considered an item in the codebook. Each trajectory in a person-time group of trajectories is encoded using this codebook. This encoding is performed in the training as well as the test videos using the bag-of-words approach. The codebook is considered part of the training process and is saved.
For the bag-of-words encoding, the distance of each trajectory in the person-time trajectory group to all items in the codebook is measured. Here we follow two approaches. In the first approach, only one vote is cast, at the index corresponding to the best-matching codebook item. In the second approach, 5 votes are cast, corresponding to the 5 best-matching codebook items; these votes are given weights inversely proportional to the distance of the trajectory from each of the five best-matching codebook items.
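A minimal sketch of this multi-vote encoding is given below; the descriptor dimensions and the normalization choices are illustrative rather than those of the actual implementation.

```python
import numpy as np

def encode_person_window(trajectories, codebook, k=5):
    """Weighted bag-of-words histogram for one person-time group of trajectories.

    trajectories: (n, d) array of trajectory descriptors (e.g., concatenated HOG/HOF).
    codebook: (64, d) array of k-means cluster centers.
    k: number of best-matching codebook items receiving a weighted vote.
    """
    hist = np.zeros(len(codebook))
    for t in trajectories:
        dists = np.linalg.norm(codebook - t, axis=1)
        nearest = np.argsort(dists)[:k]
        weights = 1.0 / (dists[nearest] + 1e-8)   # inversely proportional to distance
        hist[nearest] += weights / weights.sum()
    return hist / max(hist.sum(), 1e-8)           # normalized histogram vector

# Hypothetical example: 10 trajectories of dimension 96 and a 64-item codebook.
rng = np.random.default_rng(0)
h = encode_person_window(rng.normal(size=(10, 96)), rng.normal(size=(64, 96)))
```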
In our experiments we observe better action recognition performance with the multi-voting strategy than with the single-vote scheme, because more information is captured by multi-voting. In the S-HOCK dataset, each person is manually labelled as performing one of 23 actions, including an “other” action which covers all actions not included in the first 22 categories (Fig. 3). Each person-time group of trajectories is given an action label from the dataset. Once this group is encoded using the codebook, it becomes a single histogram vector, and each of these vectors is given the action label assigned to the corresponding person-time trajectory group.
The labelled vectors obtained from the training dataset are used to train both linear and kernel SVMs using a one-versus-all strategy. The labels of the vectors in the test data are used as ground truth, and the learned SVMs are used to predict the label of each test vector independently. The predicted labels are then compared with the ground truth labels to establish the action recognition accuracy. We observe an accuracy increase of 3% to 4% when an SVM with a Gaussian RBF kernel is used. Results are shown in Table 1 and precision-recall curves are shown in Figs. 5 & 6.
In our experiments we observe that the applauding and flag-shaking actions obtain higher accuracy than the other actions in the dataset (Table 1). This is mainly because these actions occur more frequently and contain significant discriminative motion, while the other actions occur less often and, in some cases, their motion is not discriminative. For example, for the “using device” action, when someone in the crowd uses a mobile phone or a camera, motion-based detection is not very effective.
References
[1] Home Office and The Rt Hon Mike Penning MP, “Football-related arrests and banning orders, season 2013 to 2014”, published 11 September 2014.
[2] Associated Press, “Major Soccer Stadium Disasters”, The Wall Street Journal (World), published 1 February 2012.
[3] Conigliaro, Davide, et al. “The S-Hock Dataset: Analyzing Crowds at the Stadium.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
[4] Wang, Heng, et al. “Action recognition by dense trajectories.” Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011.
[5] Dalal, Navneet, and Bill Triggs. “Histograms of oriented gradients for human detection.” Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on. Vol. 1. IEEE, 2005.
[6] Dalal, Navneet, Bill Triggs, and Cordelia Schmid. “Human detection using oriented histograms of flow and appearance.” Computer Vision–ECCV 2006. Springer Berlin Heidelberg, 2006. 428–441.
[7] Idrees, Haroon, Khurram Soomro, and Mubarak Shah. “Detecting Humans in Dense Crowds using Locally-Consistent Scale Prior and Global Occlusion Reasoning.” IEEE TPAMI 2015.
-
-
-
Plasmonic Modulator Based on Fano Resonance
The field of plasmonics continues to attract research in integrated photonics, aiming to integrate photonic components, devices and detectors on a single photonic chip, just as electronic chips contain many electronic components. The interesting properties that plasmonics offers include the enhancement of electromagnetic fields and the confinement of propagating surface plasmon polaritons to sub-100 nm features at metal-dielectric interfaces. The field of plasmonics is thereby very promising for shrinking photonic components to sizes that cannot be reached with conventional optics, and in particular for the silicon photonics industry. Many plasmonics-based applications are being increasingly developed and studied, such as electromagnetic field enhancement for surface-enhanced spectroscopy, waveguiding, sensing, modulation, switching, and photovoltaics.
We hereby propose a novel compact plasmonic resonator that can be utilized for different applications relying on optical resonance phenomena in the near-infrared spectral range, a very interesting range for a variety of applications, including sensing and modulation. The resonator structure consists of a gold layer which is etched to form a metal-insulator-metal waveguide and a rectangular cavity. The rectangular cavity and the waveguide are initially treated as being filled with a dielectric material. The strong reflectivity of gold at frequencies below the plasma frequency is the origin of the Fabry-Perot resonator behavior of the rectangular cavity. The Fano resonance was produced successfully and controlled by varying the dimensions of the rectangular cavity. The Fano profile is generated as a result of the redistribution of the electromagnetic field in the rectangular cavity, as depicted by the plasmonic mode distribution in the resonator. The Fano resonance is characterized by its sharp spectral line, which makes it attractive for applications requiring sharp spectral line shapes, such as sensing and modulation.
Optical modulators are key components in modern communication technology. Research on optical modulators aims at achieving compact designs, low power consumption and large-bandwidth operation. Plasmonic modulators emerge as promising devices since they can combine high modulation speeds with very compact designs.
The operating mechanism of our plasmonic modulator is as follows: instead of a dielectric with a constant refractive index, the metal-insulator-metal waveguide and the rectangular cavity are filled with an electro-optic polymer whose refractive index depends on a controlled stimulus. Efficient modulation is achieved by changing the applied voltage (DC signal) on the metal contacts, which changes the refractive index of the polymer and thereby shifts the resonant wavelength of the resonator, leading to signal modulation. Our modulator operates at the telecom wavelength of 1.55 μm and is thereby suitable for modern communication technology.
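As a rough, idealized picture of this tuning mechanism, ignoring the phase shifts at the metal walls and the dispersion of the plasmonic mode, the m-th cavity resonance and its shift under a refractive-index change can be written as:

```latex
\lambda_m \approx \frac{2\, n_{\mathrm{eff}}\, L}{m},
\qquad
\Delta\lambda_m \approx \frac{2\, L\, \Delta n}{m} \;=\; \lambda_m\,\frac{\Delta n}{n_{\mathrm{eff}}}
```

where $L$ is the cavity length, $n_{\mathrm{eff}}$ the effective index of the plasmonic mode, $\Delta n$ the index change induced by the applied voltage, and $m$ the integer mode order.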
Finite-Difference Time-Domain (FDTD) simulations were conducted to design the modulator structure, run the simulation experiments, study the resonance effects of the structure, and optimize its response towards the desired results. The most important result is the efficient modulation of the optical signal at the wavelengths required by modern communication technology, around 1.5 μm. All simulations were carried out using the commercially available Lumerical FDTD software.
-
-
-
Conceptual-based Functional Dependency Detection Framework
By Fahad Islam
Nowadays, knowledge discovery from data is one of the challenging problems, due to its importance in different fields such as biology, economics and the social sciences. One way of extracting knowledge from data is to discover functional dependencies (FDs). An FD describes the relation between different attributes, so that the values of one or more attributes are determined by another attribute set [1]. FD discovery helps in many applications, such as query optimization, data normalization, interface restructuring, and data cleaning. A plethora of functional dependency discovery algorithms has been proposed; some of the most widely used are TANE [2], FD_MINE [3], FUN [4], DFD [5], DEP-MINER [6], FASTFDS [7] and FDEP [8]. These algorithms extract FDs using different techniques: (1) building a search space of all attribute combinations in an ordered manner and then searching for candidate attributes that are assumed to have a functional dependency between them; (2) generating agreeing and difference sets, where the agreeing sets are acquired by applying a cross product of all tuples, the difference sets are the complement of the agreeing sets, and both sets are used to infer the dependencies; (3) generating one generic set of functional dependencies, in which each attribute determines all other attributes, and then updating this set and removing some dependencies to include more specialized dependencies through pairwise record comparisons.
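For illustration, a minimal sketch of what it means for an FD to hold is given below; the attribute names and the toy table are hypothetical.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds: any two rows that agree on the
    attributes in lhs also agree on the attributes in rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False
        seen[key] = val
    return True

table = [
    {"emp": "a", "dept": "sales", "city": "Doha"},
    {"emp": "b", "dept": "sales", "city": "Doha"},
    {"emp": "c", "dept": "it",    "city": "Doha"},
]
print(fd_holds(table, ["dept"], ["city"]))   # True: dept -> city
print(fd_holds(table, ["city"], ["dept"]))   # False: city does not determine dept
```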
Huge efforts have been dedicated to comparing the most widely used algorithms in terms of runtime and memory consumption, but little attention has been paid to the accuracy of the resulting set of functional dependencies. The accuracy of a set of functional dependencies is defined by two main factors: being complete and being minimal.
In this paper, we propose a conceptual-based functional dependency detection framework. The proposed method is mainly based on Formal Concept Analysis (FCA), a mathematical framework rooted in lattice theory that is used for conceptual data analysis, where data are represented in the form of a binary relation called a formal context [9]. From this formal context, a set of implications is extracted; these implications have the same form as FDs and are proven to be semantically equivalent to the set of all functional dependencies that hold in a given database [10]. This set of implications should be the smallest set representing the formal context, which is termed the Duquenne–Guigues, or canonical, basis of implications [11]. Moreover, completeness of the implications is achieved by applying the Armstrong rules discussed in [12].
The proposed framework is composed of three main components; they are:
Data transformation component: it converts input data to binary formal context.
Reduction component: it applies data reduction on tuples or attributes.
Implication extraction component: this is responsible for producing minimal and complete set of implications.
The key benefits of the proposed framework are:
1 It works on any kind of input data (qualitative and quantitative), which is automatically transformed into a formal context of binary relations;
2 A crisp Lukasiewicz data reduction technique is implemented to remove redundant data, which helps reduce the total runtime;
3 The set of implications produced is guaranteed to be minimal, due to the use of the Duquenne–Guigues algorithm in the extraction;
4 The set of implications produced is guaranteed to be complete, due to the use of the Armstrong rules.
The proposed framework is compared to the seven most commonly used algorithms listed above and evaluated in terms of runtime, memory consumption and accuracy using benchmark datasets.
Acknowledgement
This contribution was made possible by NPRP-07-794-1-145 grant from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.
-
-
-
An Arabic Text-to-Picture Mobile Learning System
Authors: AbdelGhani Karkar, Jihad Al Ja'am and Sebti Foufou
Handheld devices and software applications have the potential to improve learning strength, awareness, and career development. Many mobile-based learning applications are available on the market, but the shortage of Arabic learning applications is rarely taken into consideration. We have built an Arabic Text-to-Picture (TTP) mobile educational application which performs knowledge extraction and concept analysis to generate pictures that represent the content of an Arabic text. The knowledge extraction is based on Arabic semantic models covering important scopes for young children and new Arabic learners (e.g., grammar, nature, animals). The concept analysis uses semantic reasoning, semantic rules, and an Arabic natural language processing (NLP) tool to identify word-to-word relationships. Images are retrieved spontaneously from a local repository and an online search engine (e.g., Google or Bing). The instructor can select the Arabic educational content, obtain semi-automatically generated pictures, and use them for explanation. Preliminary results show improvement in Arabic learning strength and memorization.
Keywords
Mobile Learning, Natural Language Processing, Ontology, Multimedia Learning, Engineering Education.
I. Introduction
Nowadays, mobile learning environments are used extensively in diverse fields and have become common in education. In such an environment, learners are able to reach online educational materials from any location. Learners of the Arabic language suffer from a lack of adequate resources: most educational software, tools, and websites use classical techniques for introducing concepts and explaining vocabulary. We present in this paper a text-to-picture (TTP) educational mobile system that promotes Arabic children's stories through semi-automatically generated pictures that illustrate their contents in an attractive manner. Preliminary results show that the system enhances Arabic learners' comprehension, deduction and realization.
II. Background
Natural language processing (NLP) addresses the extraction of useful information from natural text. This information can be used to identify the scope of the text in order to generate summaries, classify contents and teach vocabulary. Diverse NLP-based systems that illustrate text with images have been developed recently [1, 2, 3]. In general, these systems divide the text into segments and single words, access local multimedia resources, or explore the web to get pictures and images that illustrate the content.
None of these systems and techniques, however, covers the Arabic language. In this paper, we propose an Arabic TTP educational system that uses multimedia technology to teach children in an attractive way. Our proposal generates multimedia tutorials dynamically by using Arabic text processing, entity relationship extraction, a multimedia ontology, and online extraction of multimedia contents fetched from the Google search engine.
III. Methodology
In order to develop our system, we first created the general system artwork, set up the end-user graphical user interface, designed the semantic model that stores all semantic information about terms, and collected and analyzed educational stories. We gathered 30 educational stories, annotated their terms, and associated illustrations manually. Illustrations were gathered from the Internet and an educational repository. The semantic model was developed using the Protégé editor, a free open-source ontology editor developed by Stanford [4]. The semantic model is composed of many classes that are referred to as concepts.
IV. The proposed system
The proposed system is a client-server application. When the server is launched, it loads its packages and components, loads the defined ontology and the text parser components, and finally opens a connection to listen for users' requests. Upon a successful connection, the user is able to enter or open existing Arabic stories and process them. On the client side, the processing request and the response for the story are handled in a separate thread, so that the user can continue working without interruption. Finally, the server's reply is displayed to the user on his mobile device; it consists of the processed Arabic story, related images, and different questions about an animal.
V. Conclusion
This study presents a complete system that automatically generates illustrations for Arabic stories through text processing, an Arabic ontology, relationship extraction, and illustration generation. The proposed system belongs to learning technology that can run on mobile devices to teach children in an attractive and non-traditional style. Preliminary results demonstrate that the system improves learners' comprehension and realization.
References
[1] Bui, Duy, Carlos Nakamura, Bruce E Bray, and Qing Zeng-Treitler, “Automated illustration of patients instructions,” in AMIA Annual Symposium Proceedings, vol. 2012, pp. 1158, 2012.
[2] Li, Cheng-Te, Chieh-Jen Huang, and Man-Kwan Shan, “Automatic generation of visual story for fairy tales with digital narrative,” in Web Intelligence, vol. 13, pp. 115–122, 2015.
[3] Ustalov, Dmitry and R Kudryavtsev, “An Ontology Based Approach to Text to Picture Synthesis Systems,” in Proceedings of the 2nd International Workshop on Concept Discovery in Unstructured Data (CDUD 2012), 2012.
[4] Protégé. Ontology Editor Software. Available from: http://protege.stanford.edu, Accessed: September 2015.
-
-
-
Discovering the Truth on the Web Data: One Facet of Data Forensics
Authors: Mouhamadou Lamine Ba, Laure Berti-Equille and Hossam M. Hammady
Data Forensics with Analytics, or DAFNA for short, is an ambitious project initiated by the Data Analytics Research Group in Qatar Computing Research Institute, Hamad Bin Khalifa University. Its main goal is to provide effective algorithms and tools for determining the veracity of structured information when it originates from multiple sources. The ability to efficiently estimate the veracity of data, along with the reliability level of the information sources, is a challenging problem with many real-world use cases (e.g., data fusion, social data analytics, rumour detection) in which users rely on a semi-automated data extraction and integration process in order to consume high-quality information for personal or business purposes. DAFNA's vision is to provide a suite of tools for data forensics and to investigate various research topics, such as fact-checking and truth discovery, and their practical applicability. We will present our ongoing development (dafna.qcri.org) on extensively comparing the state-of-the-art truth discovery algorithms, releasing a new system and the first REST API for truth discovery, and designing a novel hybrid truth discovery approach using active ensembling. Finally, we will briefly present real-world applications of truth discovery from Web data.
Efficient Truth Discovery. Truth discovery is a hard problem since there is no a priori knowledge about the veracity of the provided information or the reliability level of the online sources. This raises many questions about a thorough understanding of the state-of-the-art truth discovery algorithms and their applicability for actionable truth discovery. A new truth discovery approach is needed; it should be comprehensible and domain-independent, take advantage of the benefits of existing solutions, and be built on realistic assumptions for easy use in real-world applications. In this context, we propose an approach that deals with open truth discovery challenges and consists of the following contributions: (i) a thorough comparative study of existing truth discovery algorithms; (ii) the design and release of the first online truth discovery system and the first REST API for truth discovery, available at dafna.qcri.org; (iii) a hybrid truth discovery method using active ensembling; and (iv) an application to query answering related to Qatar, where the veracity of information provided by multiple Web sources is estimated.
-
-
-
Identifying Virality Attributes of Arabic Language News Articles
Authors: Sejeong Kwon, Sofiane Abbar and Bernard J. Jansen
Our research is focused on expanding the reach and impact of Arabic language news articles by attracting more readers. In pursuit of this goal, we analyze the attributes that make certain news articles go viral, relative to other news articles that do not become viral or become only mildly viral. Specifically, we focus on Arabic language news articles, as they have unique linguistic, cultural, and social constraints relative to news stories in most Western languages. In order to understand virality, we take two approaches, one based on time series and one linguistic, on an Arabic language dataset of more than 1,000 news articles with associated temporal traffic data. For data collection, we selected Kasra (“a breaking”) (http://kasra.co/), an Arabic language online news site that targets Arabic speakers worldwide, particularly in the Middle East and North Africa (MENA) region. We originally gathered more than 3,000 articles and then collected traffic data for this set, reducing it to more than 1,000 articles with complete traffic data. We focus first on the temporal attributes in order to identify virality clusters within this set of articles. Then, with topical analysis, we seek to identify linguistic aspects common to the articles within each virality cluster identified by the time-series analysis. Based on the results of the time-series analysis, we cluster articles with common temporal characteristics of traffic access. Once the articles are clustered by time series, we analyze each cluster for content attributes, topical and linguistic, in order to identify the specific attributes that may be driving the virality of the articles within each time-series cluster. To compute time-series dissimilarity, we utilize and evaluate several state-of-the-art dissimilarity-based clustering approaches, such as dynamic time warping and discrete wavelet transformation. To identify the dissimilarity measure with the most discriminating power, we conduct a principal component analysis (PCA), a statistical technique used to highlight variations and patterns in a dataset. Based on the findings of our PCA, we select the discrete wavelet transformation-based dissimilarity as the best time-series measure for our research, because the resulting principal axes explain a larger proportion of the variability (75.43 percent) than the other time-series measures we employed. We identify five virality clusters using the time series. For topic modeling, we employ Latent Dirichlet Allocation (LDA). LDA is a generative probabilistic model for collections of discrete data, such as text, that explains similarities among groups of observations within a dataset; for text modeling, the topic probabilities of LDA provide an explicit representation of a document. For the linguistic analysis, we use Linguistic Inquiry and Word Count (LIWC), a sentiment analysis tool. LIWC is a text processing program based on the occurrences of words in several categories covering writing style and psychological meaning. Prior empirical work shows the value of LIWC linguistic analysis for detecting meaning in various experimental settings, including attention focus, thinking style, and social relationships. In terms of results, surprisingly, the article topic is not predictive of the virality of Arabic language news articles.
Instead, we find that the linguistic aspects and style of a news article are the most predictive attributes of virality for Arabic news articles. In analyzing the attributes of virality in Arabic language news articles, our research finds that, perhaps counter-intuitively, the topic of the article does not impact its virality; the style of the article is the most impactful attribute for predicting virality. Building on these findings, we will leverage aspects of the news articles together with other factors to develop tools that assist content creators in reaching their user segment more effectively. Our research results will assist in understanding the virality of Arabic news and ultimately improve the readership and dissemination of Arabic language news articles.
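As an illustration of the time-series clustering step described above, a minimal sketch is given below; it clusters discrete-wavelet-transform coefficients of the traffic series with k-means, which only approximates the DWT-based dissimilarity clustering used in the study, and the traffic data are synthetic placeholders.

```python
import numpy as np
import pywt
from sklearn.cluster import KMeans

def dwt_features(series, wavelet="haar", level=3):
    """Concatenated discrete wavelet transform coefficients of one traffic series."""
    coeffs = pywt.wavedec(np.asarray(series, dtype=float), wavelet, level=level)
    return np.concatenate(coeffs)

# traffic[i] is the page-view time series of article i (synthetic placeholder data).
rng = np.random.default_rng(0)
traffic = rng.poisson(lam=5, size=(1000, 64))

X = np.vstack([dwt_features(s) for s in traffic])
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)
```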
-
-
-
Efforts Towards Automatically Generating Personas in Real-time Using Actual User Data
Authors: Bernard J. Jansen, Jisun An, Haewoon Kwak and Hoyoun Cho
The use of personas is an interactive design technique with considerable potential for product and content development. A persona is a representation of a group or segment of users sharing common behavioral characteristics. Although it represents a segment of users, a persona is generally developed in the form of a detailed narrative about an explicit but fictitious individual that stands for the collection of users possessing similar behaviors or characteristics. In order to make the fictitious individual appear as a real person to product developers, the persona narrative usually contains a variety of demographic and behavioral details about socio-economic status, gender, hobbies, family members, friends, and possessions, among many other attributes. The narrative of a persona normally also addresses the goals, needs, wants, frustrations and other emotional aspects of the fictitious individual that are pertinent to the product being designed. However, personas have typically been viewed as fairly static. In this research, we demonstrate an approach for creating and validating personas in real time, based on automated analysis of actual user data. Our data collection site and research partner is AJ+ (http://ajplus.net/), a news channel from the Al Jazeera Media Network that is natively digital, with a presence only on social media platforms and a mobile application. Its media concept is unique in that AJ+ was designed from the ground up to serve news in the medium of the viewer, rather than offering a teaser in one medium with a redirect to a website. In pursuit of our overall research objective of automatically generating personas in real time, we are specifically interested in understanding the AJ+ audience by identifying (1) whom they are reaching (i.e., the market segments) and (2) what competitive (i.e., non-AJ+) content is associated with each market segment. Focusing on one aspect of user behavior, we collect 8,065,350 instances of link sharing by 54,892 users of the online news channel, specifically examining the domains these users share. We then cluster users based on the similarity of the domains they share, identifying seven personas based on this behavioral aspect. We conduct term frequency-inverse document frequency (tf-idf) vectorization, removing outliers with fewer than 5 shares (too unique) and those appearing in more than 80% of all users' shares (too popular). We use K-means++ clustering (K = 2..10), an advanced version of K-means that improves the selection of the initial seeds and works effectively for a very sparse user-link matrix. We use the “elbow” method to choose the optimal number of clusters, which is eight in this case. In order to characterize each cluster, we list the top 100 domains of each cluster and discover that there are large overlaps among clusters. We then remove from each cluster the domains that exist in another cluster, in order to identify the relevant, unique, and impactful domains. This de-duplication results in the elimination of one cluster, leaving us with a set of clusters in which each cluster is characterized by domains that are shared only by users within that cluster. We note that the K-means++ clustering method can easily be replaced with other clustering methods in various situations.
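A minimal sketch of this vectorization and clustering step is shown below; the toy user-domain lists, the tokenization, and the parameter values are placeholders, not the study's actual data or settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Each "document" is the list of domains shared by one user (toy data; the real
# matrix is built from roughly 8M shares by 54,892 users).
user_domains = [
    "bbc.com cnn.com aljazeera.com", "espn.com bbc.com goal.com",
    "cnn.com nytimes.com", "goal.com espn.com", "aljazeera.com bbc.com",
    "nytimes.com cnn.com bbc.com", "espn.com goal.com skysports.com",
    "aljazeera.com middleeasteye.net", "cnn.com bbc.com", "goal.com skysports.com",
    "nytimes.com washingtonpost.com", "middleeasteye.net aljazeera.com",
]

# tf-idf vectorization, treating each shared domain as a term.
X = TfidfVectorizer(token_pattern=r"\S+").fit_transform(user_domains)

# K-means++ for K = 2..10; the "elbow" of the inertia curve selects the final K.
inertias = {}
for k in range(2, 11):
    km = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0).fit(X)
    inertias[k] = km.inertia_
```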
Demonstrating that these insights can be used to develop personas in real time, the research results provide insights into competitive marketing, topic interests, and preferred system features for the users of the online news medium. Using the description of each shared link, we detect its language. 55.2% of users (30,294) share links in just one language and 44.8% share links in multiple languages. The most frequently used language is English (31.98%), followed by German (5.69%), Spanish (5.02%), French (4.75%), Italian (3.46%), Indonesian (2.99%), Portuguese (2.94%), Dutch (2.94%), Tagalog (2.71%), and Afrikaans (2.69%). As millions of domains were shared, we utilize the top one hundred domains of each cluster, resulting in 700 top domains shared by the 54,892 AJ+ users. As mentioned, de-duplication resulted in the elimination of one cluster (11,011 users, 20.06%), leaving seven unique clusters, based on the sharing of domains, that represent 43,881 users. We then demonstrate how these findings can be leveraged to generate real-time personas based on actual user data. We stream the data analysis results into a relational database, combine the results with other demographic data gleaned from available sources such as Facebook and other social media accounts, and use each of the seven clusters as representative of a persona. We give each persona a fictional name and use a stock photo as the face of the persona. Each persona is linked to the top alternate (i.e., non-AJ+) domains its users most commonly share, with the personas' shared links updateable as new data arrive. The research implication is that personas can be generated in real time, instead of being the result of a laborious, time-consuming development process.
-
-
-
Creating Instructional Materials with Sign Language Graphics Through Technology
Authors: Abdelhadi Soudi and Corinne Vinopol
The education of deaf children in the developing world is in a very dire state, and there is a dearth of sign language interpreters to assist sign language-dependent students with translation in the classroom. Illiteracy within the deaf population is rampant. Over the past several years, a unique team of Moroccan and American deaf and hearing researchers has united to enhance the literacy of deaf students by creating tools that incorporate Moroccan Sign Language (MSL) and American Sign Language (ASL), under funding grants from USAID and the National Science Foundation (NSF). MSL is a gestural language distinct from both the spoken languages and the written language of Morocco and has no text representation. Accordingly, translation is quite challenging and requires the representation of MSL in graphics and video.
Many deaf and hard of hearing people do not have good facility with their native spoken language because they have no physiological access to it. Because oral languages depend, to a great extent, upon phonology, reading achievement of deaf children usually falls far short of that of hearing children of comparable abilities. And, by extension, reading instructional techniques that rely on phonological awareness, letter/sound relationships, and decoding, all skills proven essential for reading achievement, have no sensory relevance. Even in the USA, where statistics are available and education of the deaf is well advanced, on average, deaf high school graduates have a fourth grade reading level; only 7–10% of deaf students read beyond a seventh to eighth grade reading level; and approximately 20% of deaf students leave school with a second grade or below reading level (Gallaudet University's national achievement testing programs (1974, 1983, 1990, and 1996); Durnford, 2001; Braden, 1992; King & Quigley, 1985; Luckner, Sebald, Cooney, Young III, & Muir, 2006; Strong, & Prinz, 1997).
Because of spoken language inaccessibility, many deaf people rely on a sign language. Sign language is a visual/gestural language that is distinct from spoken Moroccan Arabic and Modern Standard/written Arabic and has no text representation. It can only be depicted via graphics, video, and animation.
In this presentation, we present an innovative technology, Clip and Create, a tool for the automatic creation of sign language-supported instructional material. The technology has two tools, Custom Publishing and Instructional Activities Templates, and the following capabilities:
(1) Automatically constructs customizable publishing formats;
(2) Allows users to import Sign Language clip art and other graphics;
(3) Allows users to draw free-hand or use re-sizable shapes;
(4) Allows users to incorporate text, numbers, and scientific symbols in various sizes, fonts, and colors;
(5) Saves and prints published products;
(6) Focuses on core vocabulary, idioms, and STEM content;
(7) Incorporates interpretation of STEM symbols into ASL/MSL;
(8) Generates customizable and printable Instructional Activities that reinforce vocabulary and concepts found in instructional content using Templates:
a. Sign language BINGO cards,
b. Crossword puzzles,
c. Finger spelling/spelling scrambles,
d. Word searches (in finger spelling and text),
e. Flashcards (with sign, text, and concept graphic options), and
f. Matching games (i.e., Standard Arabic-to-MSL and English-to-ASL).
(cf. Figure 1: Screenshots from Clip and Create)
The ability of this tool to efficiently create bilingual (i.e., MSL and written Arabic and ASL and English) educational materials will have a profound positive impact on the quantity and quality of sign-supported curricular materials teachers and parents are able to create for young deaf students. And, as a consequence, deaf children will show improved vocabulary recognition, reading fluency, and comprehension.
A unique aspect of this software is that written Arabic is used by many Arab countries even though the spoken language varies. Though there are variations in signs as well, there is enough consistency to make this product useful in other Arab-speaking nations as is. Any signing differences can easily be adjusted by swapping sign graphic images.
-
-
-
A Distributed and Adaptive Graph Simulation System
Authors: Pooja Nilangekar and Mohammad Hammoud
Large-scale graph processing is becoming central to our modern life. For instance, graph pattern matching (GPM) can be utilized to search and analyze social graphs, biological data and road networks, to mention a few. Conceptually, a GPM algorithm is typically defined in terms of subgraph isomorphism, whereby it seeks to find subgraphs in an input data graph, G, which are similar to a given query graph, Q. Although subgraph isomorphism forms a uniquely important class of graph queries, it is NP-complete and very restrictive in capturing sensible matches for emerging applications like software plagiarism detection, protein interaction networks, and intelligence analysis, among others. Consequently, GPM has recently been relaxed and defined in terms of graph simulation. As opposed to subgraph isomorphism, graph simulation can run in quadratic time, return more intuitive matches, and scale well to modern big graphs (i.e., graphs with billions of vertices and edges). Nonetheless, the current state-of-the-art distributed graph simulation systems still rely on graph partitioning (which is also NP-complete), induce significant communication overhead between worker machines to resolve local matches, and fail to adapt to the varying complexities of query graphs.
In this work, we observe that big graphs are not big data. That is, the largest big graph that we know of can still fit on a single physical or virtual disk (e.g., 6 TB physical disks are cheaply available nowadays and AWS EC2 instances can offer up to 24 × 2048 GB virtual disks). However, since graph simulation requires exploring the entire input big graph, G, and naturally lacks data locality, existing memory capacities can be significantly dwarfed by G's size. As such, we propose GraphSim, a novel distributed and adaptive system for efficient and scalable graph simulation. GraphSim precludes graph partitioning altogether, yet still exploits parallel processing across cluster machines. In particular, GraphSim stores G at each machine but only matches an interval of G's vertices at that machine. All machines run in parallel and each machine simulates its interval locally. If necessary, a machine can inspect the remaining dependent vertices in G to fully resolve its local matches without communicating with any other machine; hence, GraphSim does not shuffle intermediate data whatsoever. In addition, it attempts not to overwhelm the memory of any machine by employing a mathematical model to predict the best number of machines for any given query graph, Q, based on Q's complexity, G's size and the memory capacity of each machine. Consequently, GraphSim is adaptive as well. We experimentally verified the efficiency and scalability of GraphSim over private and public clouds using real-life and synthetic big graphs. The results show that GraphSim can outperform the current fastest distributed graph simulation system by several orders of magnitude.
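For readers unfamiliar with the matching semantics involved, the sketch below is a minimal single-machine version of graph simulation (it is not GraphSim's distributed algorithm): candidate sets are refined until a fixpoint is reached, in time quadratic in the graph sizes. The toy graphs and labels are hypothetical.

```python
def graph_simulation(query, data, q_label, d_label):
    """For each query vertex u, compute the set of data vertices that simulate u.

    query/data: adjacency lists (dict: vertex -> list of successors).
    q_label/d_label: vertex -> label maps for the query and data graphs.
    """
    # Initial candidates: data vertices carrying the same label as the query vertex.
    sim = {u: {v for v in data if d_label[v] == q_label[u]} for u in query}
    changed = True
    while changed:
        changed = False
        for u in query:
            for v in list(sim[u]):
                # v must have, for every query edge (u, u2), a successor in sim[u2].
                if any(not (set(data[v]) & sim[u2]) for u2 in query[u]):
                    sim[u].discard(v)
                    changed = True
    return sim

# Toy example: query edge a -> b, labels "person" -> "city".
query = {"a": ["b"], "b": []}
q_label = {"a": "person", "b": "city"}
data = {1: [2, 3], 2: [], 3: [], 4: [2]}
d_label = {1: "person", 2: "city", 3: "person", 4: "person"}
print(graph_simulation(query, data, q_label, d_label))  # {'a': {1, 4}, 'b': {2}}
```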
-
-
-
A BCI m-Learning System
Authors: AbdelGhani Karkar and Amr Mohamed
Mobile learning can help in developing students' learning strength and comprehension skills. A connection is required to enable such devices to communicate with each other. A Brain-Computer Interface (BCI) can read brain signals and transform them into readable information. For instance, an instructor can use such a device to track the interest, stress level, and engagement of his students. We propose in this paper a mobile learning system that can perform text-to-picture (TTP) conversion to illustrate the content of Arabic stories and synchronize information with connected devices in a private wireless mesh network. The instructor's device can connect to the Internet to download further illustrative information and shares its wireless and Bluetooth connections with at least one student. Students can then share the connection among each other to synchronize information on their devices. BCI devices are used to navigate, answer questions, and track students' performance. The aim of our educational system is to establish a private wireless mesh network that can operate in a dynamic environment.
Keywords
Mobile Learning, Arabic Text Processing, Brain Computer Interface, Engineering Education, Wireless Mesh Network.
I. Introduction
Nowadays, mobile devices and collaborative work have opened a new horizon for collaborative learning. As most people own private handheld smart phones, these have become the main means of connectivity and communication between people. Using smart phones for learning is beneficial and more attractive, as learners can access educational resources at any time. The various eLearning systems available provide different options for a collaborative classroom environment. However, they do not consider the needs required to adapt learning performance: they do not provide dynamic communication, automatic feedback, and other required classroom events.
II. Background
Several collaborative learning applications have been proposed. BSCW [1] enables online sharing of workspaces between distant people. Lotus Sametime Connect [2] provides services for collaborative multiway chat, web conferencing, location awareness, and so on. Saad et al. [3] proposed an intelligent collaborative system that enables a small range of mobile devices to communicate using WLAN and uses Bluetooth in case of a power outage. The architecture of the proposed system is centralized, where clients connect to a server. Saleem [4] proposed a Bluetooth Assessment System (BAS) that uses Bluetooth as an alternative for transferring questions and answers between the instructor and students. Of the many systems proposed in the domain of collaborative learning, several support mobile technology while others do not. However, BCI has not been considered as part of a mobile educational system that can enrich the learning environment by reading mental signals.
III. The proposed system
Our proposed system provides educational content and synchronizes it over a Wireless and Bluetooth mesh network. It can be used in classrooms independently of the public network. The primary device broadcasts messages to enable users to follow the instructor's explanation on their mobile devices. The proposed system covers: 1) the establishment of a wireless mesh network between mobile devices, 2) reading BCI data, 3) message communication, and 4) performance analysis of Wireless versus Bluetooth device-to-device communication.
References
[1] K. Klö, “BSCW: cooperation support for distributed workgroups,” in Proc. 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing, pp. 277–282, 2002.
[2] Lotus Sametime Connect. (2011, Feb. 17). Available: http://www.lotus.com/sametime.
[3] T. Saad, A. Waqas, K. Mudassir, A. Naeem, M. Aslam, S. Ayesha, A. Martinez-Enriquez, and P.R. Hedz, “Collaborative work in class rooms with handheld devices using bluetooth and WLAN,” IEEE 27th Canadian Conference on Electrical and Computer Engineering (CCECE), 2014, pp. 1–6.
[4] N.H. Saleem, “Applying Bluetooth as Novel Approach for Assessing Student Learning,” Asian Journal of Natural & Applied Sciences, vol. 4, no. 2, 2015.
-
-
-
Qurb: Qatar Urban Analytics
Doha is one of the fastest growing cities of the world, with a population that has increased by nearly 40% in the last five years. There are two significant trends that are relevant to our proposal. First, the government of Qatar is actively engaged in embracing the use of fine-grained data to “sense” the city for maintaining current services and future planning to ensure a high standard of living for its residents. In this line, QCRI has initiated several research projects related to urban computing to better understand and predict traffic mobility patterns in the city of Doha [1]. The second trend is the high degree of social media participation of the populace, providing a significant amount of time-oriented social sensing of all types of events unfolding in the city. A key element of our vision is to integrate data from physical and social sensing into what we call socio-physical sensing. Another key element is to develop novel analytics approaches to mine this cross-modal data to make various applications for residents smarter than they could be with a single mode of data. The overall goal is to help citizens in their everyday life in urban spaces, and also to help transportation experts and policy specialists take a real-time, data-driven approach towards urban planning and real-time traffic planning in the city.
Fast growing cities like Doha encounter several problems and challenges that should be addressed in time to ensure a reasonable quality of life for its population. These challenges encompass good transportation networks, sustainable energy sources, acceptable commute times, etc. and go beyond physical data acquisition and analytics.
In the era of Internet of Things [5], it has become commonplace to deploy static and mobile physical sensors around the city in order to capture indicators about people's behaviour related to driving, polluting, energy consumption, etc. The data collected from physical as well as social sensors has to be processed using advanced exploratory data analysis, cleaned and consolidated to remove inconsistent, outlying and duplicate records before statistical analysis, data mining and predictive modeling can be applied.
Recent advances in social computing have enabled scientists to study and model different social phenomena using user generated content shared on social media platforms. Such studies include the spread of diseases on social media [3] and studying food consumption in Twitter [4]. We envision a three layered setting: the ground, physical sensing layer, and social sensing layer. The ground represents the actual world (e.g., a city) with its inherent complexity and set of challenges. We aim at solving some of these problems by combining two data overlays to better model the interactions between the city and its population.
QCRI's vision is twofold:
From a data science perspective: Our goal is to take a holistic cross-modality view of urban data acquired from disparate urban/social sensors in order to (i) design an integrated data pipeline to store, process and consume heterogeneous urban data, and (ii) develop machine learning tools for cross-modality data mining which aids decision making for the smooth functioning of urban services;
From a social informatics perspective: Use social data generated by users and shared via social media platforms to enhance smart city applications. This could be achieved by adding a semantic overlay to data acquired through physical sensors. We believe that combining data from physical sensors with user-generated content can lead to the design of better and smarter lifestyle applications, such as an “evening out experience” recommender that optimizes for the whole experience, including driving, parking and restaurant quality, or a cab finder that takes into account the current traffic status.
Figure 1. Overview of Proposed Approach.
In Fig. 1 we provide a general overview of our cross-modality vision. While most of the effort toward building applications assisting people in their everyday life has focused on only one data overlay, we claim that combining the two overlays of data could generate a significant added value to applications on both sides.
References
[1] Chawla, S., Sarkar, S., Borge-Holthoefer, J., Ahamed, S., Hammady, H., Filali, F., Znaidi, W., “On Inferring the Time-Varying Traffic Connectivity Structures of an Urban Environment”, Proc. of the 4th International Workshop on Urban Computing (UrbComp 2015) in conjunction with KDD 2015, Sydney, Australia.
[2] Sagl, G., Resch, B., Blaschke, T., “Contextual Sensing: Integrating Contextual Information with Human and Technical Geo-Sensor Information for Smart Cities”. Sensors 2015, 15, 17013–17035.
[3] Sadilek, A., Kautz, H. A., Silenzio, V. “Modeling Spread of Disease from Social Interactions.” ICWSM. 2012.
[4] Sofiane Abbar, Yelena Mejova, and Ingmar Weber. 2015. You Tweet What You Eat: Studying Food Consumption Through Twitter. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI ‘15). ACM, New York, NY, USA, 3197–3206.
[5] Atzori, L., Iera, A., Morabito, G. “The internet of things: A survey.” Computer networks 54.15 (2010): 2787–2805.
-
-
-
Detecting Chronic Kidney Disease Using Machine Learning
Authors: Manoj Reddy and John ChoMotivation Chronic kidney disease (CKD) refers to the gradual loss of kidney function, whose primary role is to filter blood. Based on its severity, it can be classified into various stages, with the later ones requiring regular dialysis or a kidney transplant. Chronic kidney disease mostly affects patients suffering from the complications of diabetes or high blood pressure and hinders their ability to carry out day-to-day activities. In Qatar, due to the rapidly changing lifestyle, there has been an increase in the number of patients suffering from CKD. According to Hamad Medical Corporation [2], about 13% of Qatar's population suffers from CKD, whereas the global prevalence is estimated to be around 8–16% [3]. CKD can be detected at an early stage by simple tests that measure blood pressure, serum creatinine and urine albumin, which can help protect at-risk patients from complete kidney failure [1]. Our goal is to use machine learning techniques and build a classification model that can predict whether an individual has CKD based on various parameters that measure health-related metrics such as age, blood pressure, specific gravity, etc. By doing so, we shall be able to understand the different signals that identify whether a patient is at risk of CKD and help them by recommending preventive measures.
Dataset Our dataset was obtained from the UCI Machine Learning repository [4]; it contains about 400 individuals, of which 250 had CKD and 150 did not. The dataset was collected at a hospital in southern India over a period of two months. In total there are 24 fields, of which 11 are numeric and 13 are nominal, i.e., they can take only one of several categorical values. Some of the numerical fields include blood pressure, random blood glucose level, serum creatinine level, and sodium and potassium in mEq/L. Examples of nominal fields are answers to yes/no questions such as whether the patient suffers from hypertension, diabetes mellitus, or coronary artery disease. There were missing values in a few rows, which were addressed by imputing them with the mean value of the respective column feature. This ensures that the information in the entire dataset is leveraged to generate a model that best explains the data.
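A minimal sketch of the imputation step described above, assuming the dataset is loaded with pandas (the file name and the handling of nominal columns are illustrative assumptions, not the authors' code):

```python
import pandas as pd

# Load the CKD dataset (path is illustrative).
df = pd.read_csv("chronic_kidney_disease.csv")

# Impute missing numeric values with the column mean, as described above.
numeric_cols = df.select_dtypes(include="number").columns
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())

# For nominal columns a most-frequent fill is one reasonable analogue of mean imputation.
nominal_cols = df.select_dtypes(exclude="number").columns
df[nominal_cols] = df[nominal_cols].fillna(df[nominal_cols].mode().iloc[0])
```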
Approach We use two different machine learning tasks to approach this problem, namely classification and clustering. In classification, we built a model that can accurately classify whether a patient has CKD based on their health parameters. In order to understand whether people can be grouped together based on the presence of CKD, we also performed clustering on this dataset. Both approaches provide good insights into the patterns present in the underlying data. Classification This problem can be modeled as a classification task in machine learning where the two classes, CKD and not CKD, represent whether a person is suffering from chronic kidney disease or not. Each person is represented as the set of features provided in the dataset described earlier. We also have ground truth as to whether a patient has CKD, which can be used to train a model that learns how to distinguish between the two classes. Our training set consists of 75% of the data and the remaining 25% is used for testing. The ratio of CKD to non-CKD persons in the test dataset was kept approximately similar to that of the entire dataset to avoid problems of skewness. Various classification algorithms were employed, such as logistic regression, Support Vector Machines (SVM) with various kernels, decision trees and AdaBoost, so as to compare their performance. While training the model, stratified K-fold cross-validation was adopted, which ensures that each fold has the same proportion of labeled classes. Each classifier has a different methodology for learning. Some classifiers assign weights to each input feature along with a threshold that determines the output and update them based on the training data. In the case of SVM, kernels map input features into a different dimension in which the data might be linearly separable. Decision tree classifiers have the advantage that they can be easily visualized, since they are analogous to a set of rules applied to an input feature vector. Each classifier has a different generalization capability, and the efficiency depends on the underlying training and test data. Our aim is to discover the performance of each classifier on this type of medical information.
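The stratified split and cross-validated comparison described above could look roughly as follows in scikit-learn; the stand-in feature matrix below replaces the actual CKD features, so the numbers it produces will not match the reported results:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier

# Stand-in data with the same shape as the CKD dataset (400 patients, 24 features).
X, y = make_classification(n_samples=400, n_features=24,
                           weights=[0.375, 0.625], random_state=0)

# 75/25 split that preserves the CKD / not-CKD ratio, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

classifiers = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "linear SVM": SVC(kernel="linear"),
    "RBF SVM": SVC(kernel="rbf"),
    "decision tree": DecisionTreeClassifier(),
    "random forest": RandomForestClassifier(),
    "AdaBoost": AdaBoostClassifier(),
}

cv = StratifiedKFold(n_splits=5)  # each fold keeps the class proportions
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X_train, y_train, cv=cv)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```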
Clustering Clustering involves organizing a set of items into groups based on a pre-defined similarity measure. This is an unsupervised learning method that doesn't use the label information. There are various popular clustering algorithms, and we use k-means and hierarchical clustering to analyze our data. K-means involves specifying the number of clusters and the initial cluster means, which are set to random points in the data. We vary the number of groups from 2 to 5 to figure out which maximizes the quality of the clustering. Clustering with more than 2 groups might also allow us to quantify the severity of Chronic Kidney Disease (CKD) for each patient instead of the binary notion of having CKD or not. In each iteration of k-means, each person is assigned to the nearest group mean based on the distance metric, and then the mean of each group is recalculated based on the updated assignment. Once the means converge after a few iterations, k-means is stopped. Hierarchical clustering follows another approach, whereby initially each data point is an individual cluster by itself, and at every step the two closest clusters are combined to form a bigger cluster. The distance metric used in both clustering methods is Euclidean distance. Hierarchical clustering doesn't require any assumption about the number of clusters, since the resulting output is a tree-like structure containing the clusters that were merged at every time step. The clusters for a certain number of groups can be obtained by slicing the tree at the desired level. We evaluate the quality of the clustering using a well-known criterion known as purity. Purity measures the number of data points that were classified correctly based on the ground truth, which is available to us [5].
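A small helper implementing the purity criterion as defined in [5] (illustrative code, not the authors' implementation):

```python
import numpy as np

def purity(labels_true, labels_pred):
    """Cluster purity: each cluster is assigned its majority class, and we count
    the fraction of points that match that class."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    total_majority = 0
    for cluster in np.unique(labels_pred):
        members = labels_true[labels_pred == cluster]
        _, counts = np.unique(members, return_counts=True)
        total_majority += counts.max()
    return total_majority / len(labels_true)

# e.g. purity([0, 0, 1, 1, 1], [1, 1, 1, 0, 0]) == 0.8
```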
Principal Component Analysis Principal Component Analysis (PCA) is a popular tool for dimensionality reduction. It reduces the dimensionality of the data by projecting it onto the eigenvectors of the covariance matrix with the largest eigenvalues, i.e., the directions of greatest variance. We carry out PCA before using K-means and hierarchical clustering so as to reduce the complexity of the clustering and to make it easier to visualize the cluster differences in a 2D plot.
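Continuing the sketches above, a minimal PCA-then-k-means pipeline over the stand-in data, reusing the purity helper; the standardization step is an added assumption, since the abstract does not say whether the features were scaled:

```python
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Standardize, project onto the two leading principal components, then cluster.
X_scaled = StandardScaler().fit_transform(X)
X_2d = PCA(n_components=2).fit_transform(X_scaled)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_2d)
print("purity:", purity(y, kmeans.labels_))  # purity helper defined earlier

plt.scatter(X_2d[:, 0], X_2d[:, 1], c=kmeans.labels_)
plt.xlabel("PC 1")
plt.ylabel("PC 2")
plt.show()
```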
Results Classification In total, 6 different classification algorithms were used to compare their results: logistic regression, decision tree, SVM with a linear kernel, SVM with an RBF kernel, Random Forest and AdaBoost. The last two classifiers fall under the category of ensemble methods. The benefit of using ensemble methods is that they aggregate multiple learning algorithms to produce one that performs in a more robust manner. The two types of ensemble learning methods used are averaging methods and boosting methods [6].
An averaging method typically outputs the average of several learning algorithms; the one we used is the random forest classifier. A boosting method, on the other hand, “combines several weak models to produce a powerful ensemble” [6]. AdaBoost is the boosting method that we have used.
We found that the SVM with a linear kernel performed best, with 98% accuracy in predicting the labels of the test data. The next best performance came from the two ensemble methods: Random Forest with 96% and AdaBoost with 95% accuracy. The next two classifiers were logistic regression with 91% and decision tree with 90%. The classifier with the least accuracy was the SVM with an RBF kernel, which had about 60% accuracy. We believe that the RBF kernel gave lower performance because the input features are already high-dimensional and do not need to be mapped into a higher-dimensional space by RBF or other non-linear kernels. A Receiver Operating Characteristic (ROC) curve can also be plotted to compare the true positive rate and false positive rate. We also plan to compute other evaluation metrics such as precision, recall and F-score. The results are promising, as the majority of the classifiers have a classification accuracy above 90%.
After classifying the test dataset, feature analysis was performed to compare the importance of each feature. The most important features across the classifiers were the albumin level and serum creatinine. The logistic regression classifier also included the ‘pedal edema’ feature along with the previous two. The red blood cell feature was marked as important by the decision tree and AdaBoost classifiers.
Clustering After performing clustering on the entire dataset using K-Means we were able to plot it on a 2D graph since we used PCA to reduce it to two dimensions. The purity score of our clustering is 0.62. A higher purity score (max value is 1.0) represents a better quality of clustering. The hierarchical clustering plot provides the flexibility to view more than 2 clusters since there might be gradients in the severity of CKD among patients rather than the simple binary representation of having CKD or not. Multiple clusters can be obtained by intersecting the hierarchical tree at the desired level.
Conclusions We currently live in the big data era. There is an enormous amount of data being generated from various sources across all domains. Some of them include DNA sequence data, ubiquitous sensors, MRI/CAT scans, astronomical images etc. The challenge now is being able to extract useful information and create knowledge using innovative techniques to efficiently process the data. Due to this data deluge phenomenon, machine learning and data mining have gained strong interest among the research community. Statistical analysis on healthcare data has been gaining momentum since it has the potential to provide insights that are not obvious and can foster breakthroughs in this area.
This work aims to combine work in the field of computer science and health by applying techniques from statistical machine learning to health care data. Chronic kidney disease (CKD) affects a sizable percentage of the world's population. If detected early, its adverse effects can be avoided, hence saving precious lives and reducing cost. We have been able to build a model based on labeled data that accurately predicts if a patient suffers from chronic kidney disease based on their personal characteristics.
Our future work is to include a larger dataset consisting of thousands of patients and a richer set of features, which would improve the richness of the model by capturing greater variation. We also aim to use topic models such as Latent Dirichlet Allocation to group various medical features into topics so as to understand the interactions between them. There needs to be greater encouragement for such interdisciplinary work in order to tackle grand challenges and, in this case, realize the vision of evidence-based healthcare and personalized medicine.
References
[1] https://www.kidney.org/kidneydisease/aboutckd
[3] http://www.ncbi.nlm.nih.gov/pubmed/23727169
[4] https://archive.ics.uci.edu/ml/datasets/Chronic_Kidney_Disease
[5] http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html
-
-
-
Legal Issues in E-Commerce: A Case Study of Malaysia
Electronic commerce is the process of buying, selling, transferring or exchanging products, services and/or information via computer networks, including the Internet. In an e-commerce environment, just as in a traditional paper-based commercial transaction, sellers present their products, prices and terms to potential buyers. The buyers consider their options, negotiate prices and terms (if necessary), place orders and make payment. E-commerce is growing at a significant rate all over the world due to its efficient business transactions. Despite the development of e-commerce, there is uncertainty as to whether the traditional principles of contract law are applicable to electronic contracts. In the formation of an e-contract, the parties might disagree on at what point and in which country an e-contract is formed. Malaysia, as well as other countries, has enacted legislation on e-commerce in compliance with international organizations, i.e., the United Nations Commission on International Trade Law (UNCITRAL). The aim and objective of this paper is to assess the adequacy of the existing legislation in Malaysia on e-commerce. This paper also examines the creation of legally enforceable agreements with regard to e-commerce in Malaysia, digital signatures, and the uncertainty of where and when an e-contract is formed.
-
-
-
Latest Trends in Twitter from Arab Countries and the World
Authors: Wafa Waheeda Syed and Abdelkader Lattab
1. Introduction
Twitter is a micro-blogging social media platform with a wide variety of content. The open access to Twitter data through the Twitter APIs has made it an important area of research. Twitter has a useful feature called “Trends”, which displays hot topics or trending information that differs for every location. This trending information evolves from the tweets being shared on Twitter in a particular location. However, Twitter limits the trending information to current tweets, as the algorithm for finding trends concentrates on generating trends in real time rather than summarizing hot topics on a daily basis. Thus, a clear summarization of contemporary trending information is missing and is much needed. Latest Twitter Trends, the application discussed in this paper, is built to aggregate hot topics on Twitter for Arab countries and the world. It is a real-time application with summarization of hot topics over time. It enables users to study the summarization of Twitter trends by location with the help of a word cloud. The tool also enables the user to click on a particular trend, which allows the user to navigate and search through Twitter Search, also in real time. The tool also overcomes a drawback of the Twitter trending information, in addition to the Twitter trend algorithm: trends differ for different languages in different locations and are often mixed. For example, if #Eid-ul-Adha is trending in Arab countries, عيد الأضحى# is also trending. This application focuses on consolidating trends in Arabic and English that have the same meaning and displaying only one trending topic instead of two identical topics in different languages. The application also gives an estimation of the different kinds of Twitter users, analyzing the percentage of tweets made by males and females in that location.
2. Trends data gathering
Twitter APIs give developers access to real-time data comprising tweets and trends. The Twitter REST API is used by the tool, Latest Twitter Trends, to connect to Twitter and get trending data. The API is used to authenticate and establish a connection with Twitter and returns trending data in JSON format. The Python programming language is used to write scripts that gather data from Twitter. A data crawling script connects to the Twitter API by authenticating with the credentials generated by Twitter when creating an application at app.twitter.com. The Consumer Key, Consumer Secret, Access Token and Access Token Secret are the credentials used to authenticate with Twitter. The data returned by Twitter is in JSON (JavaScript Object Notation) format, and the Python data crawling script parses the JSON files and creates a CSV database. This high-level data gathering comprises the following: the Python data crawling script connects and authenticates with the Twitter API and gets the trending-places data in JSON format; this data is stored in our tool's database as a CSV file. The data gathered consists of all trending locations/places with their WOEID (Where On Earth ID). The WOEID is used as a key to get the Twitter trending topics location by location, in real time, using the Twitter REST API. The trends for every location are also returned to the tool in JSON format, which is again converted to CSV for storage in the tool database. This trends CSV file is appended every time new trending data is collected from Twitter. Another CSV file in the database holds only the current information for all trending places, for later use. Natural language processing, using a dictionary, is applied to the trends-by-location CSV data to consolidate Arabic and English trending topics so that they are counted as one. The results are stored in a CSV file and used for hot topic identification.
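A rough sketch of the trend-gathering step, assuming the tweepy library (version 3.x, where API.trends_available and API.trends_place wrap the corresponding REST endpoints); the credentials are placeholders and the CSV layout is an assumption:

```python
import csv
import tweepy  # assuming tweepy 3.x; method names may differ in later versions

# Credentials generated when creating a Twitter application (placeholders).
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
api = tweepy.API(auth)

# Every location Twitter reports trends for, each with its WOEID.
places = api.trends_available()

with open("trends.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    for place in places[:5]:  # a few places only, to stay within rate limits
        response = api.trends_place(place["woeid"])
        for trend in response[0]["trends"]:
            writer.writerow([place["name"], place["woeid"], trend["name"]])
```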
3. Hot topic identification
After the high-level data gathering, the CSV files are used as a database for generating a word cloud using D3.js. The trending data is processed by counting occurrences to estimate which topics were trending for a long time. The frequency is taken as the count value for a trending topic and a word cloud is generated. The frequency calculation is implemented as a Python script written for word cloud data crawling. This script takes the trends-by-location data as input and generates a database of trends by city in JSON files, with the trend topic as key and the frequency of its occurrence as value.
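A minimal sketch of the frequency counting that feeds the D3.js word cloud (file names and the CSV layout follow the sketch above and are assumptions):

```python
import csv
import json
from collections import Counter

def build_wordcloud_data(trends_csv, city, out_json):
    """Count how often each trend appears for one city and write a {trend: frequency}
    mapping that a D3.js word cloud can consume."""
    counts = Counter()
    with open(trends_csv, newline="", encoding="utf-8") as f:
        for place_name, woeid, trend in csv.reader(f):
            if place_name == city:
                counts[trend] += 1
    with open(out_json, "w", encoding="utf-8") as f:
        json.dump(counts, f, ensure_ascii=False, indent=2)

build_wordcloud_data("trends.csv", "London", "london_wordcloud.json")
```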
4. Architecture
Figure 1: Latest Trends in Twitter Application Architecture. The Python scripts for data crawling and word cloud crawling are used to connect to Twitter, gather data, and process and store it in a database. D3.js and the Google Fusion Tables API are used to display the application results. The Google Fusion Tables API is used to create a map containing the current trends by location, geo-tagged on the map. A dedicated Java program connects and authenticates with the Google API, clears the old Fusion Table data, and imports the new, updated rows into the Google Fusion Table. The Python script Tagcloud.py generates per-city JSON files of trending topics from the Trends.csv file; these files form the database used to generate the word cloud with D3.js, individually for every city/location. The Fusion Table is used to visualize the trending information from Twitter.
5. Results
The data crawling script establishes a connection with Twitter and returns trending data in JSON format, as shown in Fig. 2. This data is processed and saved as a CSV into our application database for later use. Figure 2: Trends data output from Twitter in JSON format. The word cloud crawling script generates key-value pairs of processed trending data from the database, with the key containing the trending topic and the value containing the frequency of the trending topic's occurrence. Fig. 3 displays the JSON dataset used for generating the word cloud. Figure 3: JSON data of the processed trending data. The word cloud is generated using the D3.js library and is used to display summarized trending data to the user. Figure 4 shows the word cloud result for London. Figure 4: Word cloud for trending data.
-
-
-
Mitigation of Traffic Congestion Using Ramp Metering on Doha Expressway
Authors: Muhammad Asif Khan, Ridha Hamila and Khaled Salah ShaabanRamp metering is the most effective and widely implemented strategy for improving traffic flow on freeways by restricting the number of vehicles entering a freeway using ramp meters. A ramp meter is a traffic signal programmed with a much shorter cycle time in order to allow a single vehicle or a very small platoon of vehicles (usually two or three) per green phase. Ramp metering algorithms define the underlying logic that calculates the metering rate. Ramp meters are usually employed to control vehicles entering the freeway (mainline) at an on-ramp, to mitigate the impact of the ramp traffic on the mainline flow. However, ramp meters can also be used to control traffic flow from freeway to freeway. The selection of an appropriate ramp metering strategy is based on the needs and goals of the regional transportation agency. Ramp meters can be controlled either locally (isolated) or system-wide (coordinated). Locally controlled or isolated ramp meters control vehicle access based on the local traffic conditions on a single ramp or freeway segment, to reduce congestion near the local ramp. System-wide or coordinated ramp meters are used to improve traffic conditions on a freeway segment or the entire freeway corridor. Ramp meters can be programmed as either fixed-time or traffic-responsive. Fixed-time metering uses pre-set metering rates with a defined schedule based on historical traffic data. Fixed or pre-timed metering addresses recurring congestion, but fails in the case of non-recurring congestion. Traffic-responsive metering uses present traffic conditions to adjust its metering rate; traffic data is collected in real time using loop detectors or other surveillance systems. Traffic-responsive control can be implemented in both isolated and coordinated ramp meters. Some known traffic-responsive algorithms include Asservissement Linéaire d'Entrée Autoroutière (ALINEA), Heuristic Ramp Metering Coordination (HERO), System Wide Adaptive Ramp Metering (SWARM), fuzzy logic, the stratified zone algorithm, the bottleneck algorithm, the zone algorithm, the HELPER algorithm and Advanced Real Time Metering (ARM). These algorithms were developed in various regions of the world, and some of them have been evaluated over long periods of time. However, differences in traffic parameters, drivers' behaviors, road geometries and other factors can affect the performance of an algorithm when implemented in a new location. Hence it is necessary to investigate the performance of a ramp metering strategy prior to physical deployment. In this work, we chose the Doha Expressway to deploy ramp metering for improving traffic conditions. The Doha Expressway is a six-lane highway in Qatar that links the north of Doha to the south. The highway can be accessed through several on-ramps at different locations. The merging of ramp traffic onto the freeway often causes congestion on the highway in several ways: it increases traffic density on the highway, reduces vehicle speeds and causes vehicles to change lanes in the merging area. Hence, in this research we first investigated the impact of ramp traffic on the mainline flow and identified the potential bottlenecks. Then, ramp meters were placed at some of the on-ramps to evaluate the performance of each in improving the traffic flow on the mainline. The outcome of this study is the selection of the optimum metering strategy for each on-ramp, with proposed modifications if required.
Extensive simulations are carried out in the PTV VISSIM traffic micro-simulation software. The simulator is calibrated using real-time traffic data and geometric information.
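For illustration, the ALINEA algorithm named above follows a simple local feedback law; the sketch below uses textbook-style parameter values, not values calibrated for the Doha Expressway:

```python
# Minimal sketch of the ALINEA feedback law:
#   r(k) = r(k-1) + K_R * (o_target - o_measured(k))
# where r is the metering rate (veh/h) and o is the downstream occupancy.
# Gains and bounds below are illustrative only.

def alinea_step(r_prev, occ_measured, occ_target=0.20, k_r=70.0,
                r_min=200.0, r_max=1800.0):
    """One control interval of ALINEA; returns the new metering rate in veh/h."""
    r = r_prev + k_r * (occ_target - occ_measured) * 100  # occupancy given as a fraction
    return max(r_min, min(r_max, r))

# Example: occupancy above target -> the meter releases fewer vehicles next interval.
print(alinea_step(r_prev=900.0, occ_measured=0.26))  # 900 + 70*(-6) = 480 veh/h
```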
-
-
-
Distributed Multi-Objective Resource Optimization for Mobile-Health Systems
Authors: Alaa Awad Abdellatif and Amr MohamedMobile-health (m-health) systems leverage wireless and mobile communication technologies to promote new ways to acquire, process, transport, and secure raw and processed medical data. M-health systems provide the scalability needed to cope with the increasing number of elderly and chronic-disease patients requiring constant monitoring. However, the design and operation of such pervasive health monitoring systems with Body Area Sensor Networks (BASNs) is challenging for two reasons: first, the limited energy, computational and storage resources of the sensor nodes; second, the need to guarantee application-level Quality of Service (QoS). In this paper, we integrate wireless network components and application-layer characteristics to provide sustainable, energy-efficient and high-quality services for m-health systems. In particular, we propose an Energy-Cost-Distortion (E-C-D) solution, which exploits the benefits of medical data adaptation to optimize transmission energy consumption and the cost of using network services. However, in large-scale networks, and due to the heterogeneity of wireless m-health systems, a centralized approach becomes less efficient. Therefore, we present a distributed cross-layer solution, which is suitable for heterogeneous wireless m-health systems and scalable with the network size. Our scheme leverages Lagrangian duality theory and enables us to find an efficient trade-off among energy consumption, network cost, and vital-sign distortion for delay-sensitive transmission of medical data over heterogeneous wireless environments. In this context, we propose a solution that enables energy-efficient, high-quality patient health monitoring to facilitate remote chronic disease management. We propose a multi-objective optimization problem that targets different QoS metrics, namely signal distortion, delay, and Bit Error Rate (BER), as well as monetary cost and transmission energy. In particular, we aim to achieve the optimal trade-off among the above factors, which exhibit conflicting trends.
The main contributions of our work can be summarized as follows:
- (1) We design a system for EEG health monitoring systems that achieves high performance by properly combining network functionalities and EEG application characteristics.
- (2) We formulate a cross-layer multi-objective optimization model that aims at adapting and minimizing, at each PDA, the encoding distortion and monetary cost at the application layer, as well as the transmission energy at the physical layer, while meeting the delay and BER constraints.
- (3) We use geometric program transformation to convert the aforementioned problem into a convex problem, for which an optimal, centralized solution can be obtained.
- (4) By leveraging Lagrangian duality theory, we then propose a scalable distributed solution. The dual decomposition approach enables us to decouple the problem into a set of sub-problems that can be solved locally, leading to a scalable distributed algorithm that converges to the optimal solution (a generic sketch of this pattern is given after the list).
- (5) The proposed distributed algorithm for EEG based m-health systems is analyzed and compared to the centralized approach.
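As a generic illustration of the dual decomposition pattern referred to in item (4), and not the paper's actual E-C-D formulation, the decoupling can be sketched as follows (all symbols are illustrative):

```latex
% Generic dual-decomposition pattern: local objectives f_i coupled by one shared resource constraint.
\begin{align*}
  &\min_{\{x_i\}} \ \sum_i f_i(x_i)
   \quad \text{s.t.} \quad \sum_i g_i(x_i) \le C \\[4pt]
  &L(\{x_i\}, \lambda) = \sum_i \big( f_i(x_i) + \lambda\, g_i(x_i) \big) - \lambda C \\[4pt]
  &x_i^\star(\lambda) = \arg\min_{x_i} \ f_i(x_i) + \lambda\, g_i(x_i)
   \qquad \text{(solved locally at each node)} \\[4pt]
  &\lambda^{(t+1)} = \Big[ \lambda^{(t)} + \alpha_t \Big( \textstyle\sum_i g_i\big(x_i^\star(\lambda^{(t)})\big) - C \Big) \Big]^{+}
   \qquad \text{(subgradient update of the dual price)}
\end{align*}
```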
Our results show the efficiency of our distributed solution, its ability to converge to the optimal solution, and its ability to adapt to varying network conditions. In particular, simulation results show that the proposed scheme achieves the optimal trade-off between energy efficiency and the QoS requirements of health monitoring systems. Moreover, it offers significant savings in the objective function (i.e., the E-C-D utility function) compared to solutions based on equal bandwidth allocation.
-
-
-
Multimodal Interface Design for Ultrasound Machines
Authors: Yasmin Halwani, Tim S.E. Salcudean and Sidney S. FelsSonographers, radiologists and surgeons use ultrasound machines on a daily basis to acquire images for interventional procedures, scanning and diagnosis. The current interaction with ultrasound machines relies completely on physical keys and touch-screen input. In addition to not offering a sterile interface for interventional procedures and operations, using the ultrasound machine requires the clinician to be physically close to the machine to use its keys, which restricts the clinician's free movement and natural posture when applying the probe to the patient and often forces uncomfortable ergonomics for prolonged periods of time. According to surveys conducted continuously over the past decade on the incidence of work-related musculoskeletal disorders (WRMSDs), up to 90% of sonographers across North America experience WRMSDs during routine ultrasonography. Repetitive motions and prolonged static postures are among the risk factors for WRMSDs, both of which can be significantly reduced by introducing an improved ultrasound-machine interface that does not rely completely on direct physical interaction. Furthermore, the majority of physicians who perform ultrasound-guided interventions hold the probe with one hand while inserting a needle with the other, which makes adjusting ultrasound machine parameters unreachable without external assistance. Similarly, surgeons' hands are typically occupied with sterile surgical tools, so they are unable to control ultrasound machine parameters independently. The need for an assistant is suboptimal, as it is sometimes difficult for the operator or surgeon to communicate a specific intent during a procedure. Introducing a multimodal interface for ultrasound machine parameters that improves the current interface and is capable of hands-free interaction can bring an unprecedented benefit to all types of clinicians who use ultrasound machines, as it will contribute to reducing the strain-related injuries and cognitive load experienced by sonographers, radiologists and surgeons and introduce a more effective, natural and efficient interface.
Due to the need for sterile, improved and efficient interaction and the availability of low-cost hardware, multimodal interaction with medical imaging tools is an active research area. There have been numerous studies exploring speech, vision, touch and gesture recognition for interacting with both pre-operative and interventional image parameters during interventional procedures or surgical operations. However, research targeting multimodal interaction with ultrasound machines has not been sufficiently explored and is mostly limited to augmenting one interaction modality at a time, such as the existing commercial software and patents on enabling ultrasound machines with speech recognition. Given the wide range of settings and menu navigation required for ultrasound image acquisition, there is potential for improving the interaction by expanding the existing physical interface with hands-free interaction methods such as voice, gesture, and eye-gaze recognition. Namely, it will simplify the image-settings menu navigation required to complete a scanning task through the system's ability to recognize the user's context from the additional interaction modalities. In addition, the user will not be restricted by a physical interface and will be able to interact with the ultrasound machine completely hands-free using the added interaction modalities, as explained earlier for sterile environments in interventional procedures.
Field studies and interviews with sonographers and radiologists have been conducted to explore the potential areas of improvement of current ultrasound systems. Typical ultrasound machines used by sonographers for routine ultrasonography tend to have an extensive physical interface with keys and switches all co-located in the same area as the keyboard for all possible ultrasonography contexts. Although the keys are distributed based on their typical frequency of use in common ultrasonography exams, sonographers tend to glance at the keys repeatedly during a routine ultrasound session, which takes away from their uninterrupted focus on the image. Although it varies based on the type of the ultrasound exam, typically an ultrasound exam takes an average of 30 minutes, requiring a capture of multiple images. For time-sensitive tasks, such as imaging anatomical structures in constant motion, the coordination between the image, keys selection, menu navigation and probe positioning can be both time-consuming and distracting. Interviewed sonographers also reported their discomfort with repeated awkward postures and their preference for a hands-free interface in cases where they have to position the ultrasound probe at a faraway distance from where the ultrasound physical control keys are located, as in the case with immobile patients or patients with high BMI.
Currently, there exists commercial software that addresses the issue of repeated physical keystrokes and the need for a hands-free interface. Some machines provide a context-aware solution in the form of customizable software that automates steps in ultrasound exams, which is reported to have decreased keystrokes by 60% and exam time by 54%. Other machines provide voice-enabled interaction with the ultrasound machine to reduce the uncomfortable postures of sonographers trying to position the probe while reaching for the physical keys on the machine. Interviewed sonographers frequently used the context-aware automated interaction software during their ultrasound exams, which shows the potential of the context-aware features that multimodal interaction systems can offer. On the other hand, sonographers did not prefer using voice commands as a primary interaction modality in addition to the existing physical controls, as an ultrasound exam involves a lot of communication with the patient, and relying on voice input might cause sonographer-patient conversation to be misinterpreted as commands directed at the machine. This leads to the conclusion that voice-enabled systems need to be augmented with other interaction modalities so that they can be used efficiently when needed and not be confused with external voice interaction.
This study aims to explore interfaces for controlling ultrasound machine settings during routine ultrasonography and interventional procedures through multimodal input. The main goal is to design an efficient, time-saving and cost-effective system that minimizes the amount of repetitive physical interaction with the ultrasound machine, in addition to providing a hands-free mode to reduce WRMSDs and allow direct interaction with the machine in sterile conditions. This will be achieved through additional field studies and prototyping, followed by user studies to assess the developed system.
-
-
-
Pokerface: The Word-Emotion Detector
Authors: Alaa Khader, Ashwini Kamath, Harsh Sharma, Irina Temnikova, Ferda Ofli and Francisco GuzmánEvery day, humans interact with text from different sources such as news, literature, education, and even social media. While reading, humans process text word by word, accessing the meaning of a particular word from the lexicon and, when needed, changing its meaning to match the context of the text (Harley, 2014). The process of reading can induce a range of emotions, such as engagement, confusion, frustration, surprise or happiness. For example, when readers come across unfamiliar jargon, this may confuse them as they try to understand the text.
In the past, scientists have addressed emotion in text from the writer's perspective. For example, the field of Sentiment Analysis aims to detect the emotional charge of words to infer the intentions of the writer. Here, however, we propose the reverse approach: detecting the emotions produced in readers while they process text.
Detecting which emotions are induced by reading a piece of text can give us insights about the nature of the text itself. A word-emotion detector can be used to assign specific emotions experienced by readers to specific words or passages of text. This area of research has never been explored before.
There are many potential applications of a word-emotion detector. For example, it can be used to analyze how passages in books, news or social media are perceived by readers, which can guide stylistic choices to cater to a particular audience. In a learning environment, it can be used to detect the affective states and emotions of students, so as to infer their level of understanding, which can in turn be used to provide assistance over difficult passages. In a commercial environment, it can be used to detect reactions to the wording in advertisements. In the remainder of this report, we detail the first steps we followed to build a word-emotion detector. Moreover, we present the details of our system developed during QCRI's 2015 Hot Summer Cool Research internship program, as well as the initial experiments. In particular, we describe our experimental setup, in which viewers watch a foreign-language video with modified subtitles containing deliberate emotion-inducing changes. We analyze the results and discuss future work.
The Pokerface System
A Pokerface is an inscrutable face that reveals no hint of a person's thoughts or feelings. The goal of the ‘Pokerface’ project is to build a word-emotion detector that works even if no facial movements are present. To do so, the Pokerface system uses a unique symbiosis of the latest consumer-level technologies: eye-tracking to detect the words being read; electroencephalography (EEG) to detect the brain activity of the reader; and facial-expression recognition (FER) to detect movement in the reader's face. We then classify the detected brain activity and facial movements into emotions using Neural Networks.
In this report, we present the details of our Pokerface system, as well as the initial experiments done during QCRI's 2015 Hot Summer Cool Research internship program. In particular, we describe the setup in which viewers watch a foreign language video with subtitles containing deliberate emotion inducing changes.
Methodology
To detect emotions experienced by readers as they read text, we used different technologies. FER and EEG are used to detect emotional reactions through changes in facial expressions and brainwaves, while eye-tracking is used to identify the stimulus (text) to the reaction detected. A video interface was created to run the experiments. Below we describe each of them independently, and how we used them in the project.
EEG
EEG is the recording of electrical activity along the scalp (Niedermeyer and Lopes da Silva, 2005). EEG measures voltage fluctuations resulting from ionic current flows within the neurons of the brain. EEG is one of the few non-intrusive techniques available that provides a window on physiological brain activity. EEG averages the response from many neurons as they communicate, measuring the electrical activity by surface electrodes. We can then use the brain activity of a user to detect their emotional status.
Data Gathering
In our experiments, we used the Emotiv EPOC neuroheadset (2013), which has 14 EEG channels plus two references, inertial sensors, and two gyroscopes. The raw data from the neuroheadset was parsed with the timestamps for each sample.
Data Cleaning and Artifact Removal
After retrieving the data from the EEG, we need to remove “artifacts”, which are changes in the signals that do not originate from neurons (Vidal, 1977), such as ocular movements, muscular movements, and technical noise. To do so, we used the open-source toolbox EEGlab (Delorme & Makeig, 2004) for artifact removal and filtering (band-pass filtering the data to 4–45 Hz, which also suppresses line noise).
ERP Collection
We decided to treat the remaining artifacts as random noise and move forward with extracting Event Related Potentials (ERPs), since all the other options we found required some level of manual intervention. ERPs are the sections of our EEG data that are relevant with respect to the stimuli and the subjects' reaction time. To account for random effects from the artifacts, we averaged the ERPs over different users and events. To do so, we used EEGlab's plugin ERPlab to add event codes to our continuous EEG data based on stimulus time.
Events
Our events were defined as textual modifications in subtitles, designed to induce emotions of confusion, frustration or surprise. The time at which the subject looks at a word was marked as the stimulus time (st) for that word, and the end of the reaction window was marked as st+800 ms, because reactions to a stimulus are rarely observed later than 800 ms after its appearance (Fischler and Bradley, 2006).
The ERPs were obtained as the average of different events corresponding to the same condition (control or experimental).
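A minimal sketch of this epoch-averaging step (the array shapes, the 128 Hz sampling rate assumed for the Emotiv EPOC, and the 800 ms window are parameters of the sketch, not the authors' exact pipeline):

```python
import numpy as np

def average_erp(eeg, stimulus_samples, fs=128, window_ms=800):
    """Average fixed-length epochs of cleaned EEG starting at each stimulus onset.

    eeg: (n_channels, n_samples) cleaned EEG; stimulus_samples: onsets in samples.
    Returns a (n_channels, window) array, i.e. the averaged ERP for one condition."""
    win = int(fs * window_ms / 1000)
    epochs = [eeg[:, s:s + win] for s in stimulus_samples if s + win <= eeg.shape[1]]
    return np.mean(epochs, axis=0)

# e.g. separate averages for the control and experimental (modified-subtitle) events:
# erp_control = average_erp(eeg, control_onsets)
# erp_exp = average_erp(eeg, experimental_onsets)
```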
Eye-Tracking
An eye-tracker is an instrument to detect the movements of the eye. Based on the nature of the eye and human vision, the eye-tracker identifies where a user is looking by shining a light that will be reflected into the eye, such that the reflection will be captured by image sensors. The eye-tracker will then measure the angle between the cornea and pupil reflections to calculate a vector and identify the direction of the gaze.
In this project, we used the EyeTribe eye-tracker to identify the words a reader looked at while reading. It was set up on a Windows machine. Before an experiment, the user needs to calibrate the eye-tracker, and recalibration is necessary every time the user changes their sitting position. While the eye-tracker is running, Javascript and NodeJS were used to create a function that extracts and parses the data from the machine and writes it to a text file. This data includes the screen's x and y coordinates of the gaze, the timestamp, and an indicator of whether the gaze point is a fixation or not. The data is received at a rate of 60 samples per second. The gaze points are used to determine which words are being looked at at any specific time.
Video Interface
In our experiments, each user was presented with a video with subtitles. To create the experimental interface, we made different design choices based on previous empirical research. Therefore, we used Helvetica font, given its consistency across all platforms (Falconer, 2011), and used the font size 26 given that it improves the readability of subtitles on large desktops (Franz, 2014). We used Javascript to detect the location of each word that was displayed on the screen.
After gathering the data from the experiment, we used an off-line process to detect the “collisions” between the eye-tracker gaze points and the words displayed to the user. To do so, we used both the time information and the coordinate information. The result was a series of words annotated with the specific time spans in which they were looked at.
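A simplified sketch of this off-line collision detection; the gaze and word record formats below are assumptions, not the actual file formats used:

```python
# Off-line "collision" detection between gaze samples and subtitle words.
# Assumed formats: gaze rows are (timestamp_ms, x, y); each word has a screen
# bounding box and the time span during which it was on screen.

def collide(gaze_samples, words):
    """Return {word_id: [timestamps]} for every gaze sample that falls inside a
    word's bounding box while that word is displayed."""
    hits = {}
    for t, x, y in gaze_samples:
        for w in words:
            if (w["t_on"] <= t <= w["t_off"]
                    and w["x"] <= x <= w["x"] + w["width"]
                    and w["y"] <= y <= w["y"] + w["height"]):
                hits.setdefault(w["id"], []).append(t)
    return hits

words = [{"id": "confuse", "t_on": 1000, "t_off": 4000,
          "x": 300, "y": 650, "width": 90, "height": 30}]
print(collide([(1500, 340, 660), (5000, 340, 660)], words))  # only the first sample hits
```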
FER
Facial Expression Recognition (FER) is the process of detecting an individual's emotion by analyzing their facial expressions in an image or video. In the past, FER has been used for various purposes, including psychological studies, tiredness detection, facial animation, robotics, etc.
Data Gathering
We used the Microsoft Kinect with the Kinect SDK 2.0 to capture the individual's face. The data extracted from the Kinect provided us with color and infrared images, as well as depth data; however, for this project we only worked with the color data. The data from the Kinect was saved as a sequence of color images, recorded at a rate of 30 frames per second (fps). The code used multithreading to ensure a high frame rate and low memory usage. Each image frame was assigned a timestamp in milliseconds, which was saved in a text file.
Feature Extraction
After extracting the data from the Kinect, the images were processed to locate the facial landmarks. We tested the images with Face++, a free API for face detection, recognition and analysis. Using Face++, we were able to locate 83 facial landmarks on the images. The data obtained from the API was the name of each landmark along with its x- and y-coordinates.
The next step involved obtaining Action Units (AUs) from the facial landmarks located through Face++. Action Units are the actions of individual muscles or groups of muscles, such as raising the outer eyebrow or stretching the lips (Cohn et al., 2001). To determine which AUs to use for FER, as well as how to calculate them, Tekalp and Ostermann (2000) was taken as a reference.
Classification
The final step of the process was classifying the image frames into one of eight different emotions: happiness, sadness, fear, anger, disgust, surprise, neutral and confused. We used the MATLAB Neural Network toolkit (MathWorks, Inc., 2015) and designed a simple feed-forward neural network with backpropagation.
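The classification step could be sketched in Python as follows; this is an analogue using scikit-learn's MLPClassifier rather than the MATLAB toolkit actually used, and the stand-in data replaces the real action-unit features:

```python
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

# Eight target emotions, as listed above.
EMOTIONS = ["happiness", "sadness", "fear", "anger",
            "disgust", "surprise", "neutral", "confused"]

# Stand-in data: rows of AU-derived features with one emotion label per frame.
X, y = make_classification(n_samples=800, n_features=20, n_informative=12,
                           n_classes=8, random_state=0)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
clf.fit(X_train, y_train)  # feed-forward network trained with backpropagation
print("frame-level accuracy:", clf.score(X_test, y_test))
```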
Pilot results
EEG
In our pilot classification study, we experimented with the Alcoholism data used in the Kuncheva and Rodriguez (2012) paper, from the UC Irvine (UCI) machine learning repository (2010), which contains raw ERP data for 10 alcoholic subjects and 10 sober subjects. We extracted features using interval feature extraction; a total of 96K features were extracted from each subject's data. We achieved around 98% accuracy on the training data.
FER
We experimented on three different individuals as well as several images from the Cohn-Kanade facial expressions database. The application had roughly 75–80% accuracy, and this accuracy could be further improved by adding more data to the training set. It was also observed that the classifier was more accurate on certain emotions than on others. For example, images depicting happiness were classified correctly more often than images depicting any other emotion. The classifier had difficulty distinguishing between fear, anger and sadness.
Conclusion
In this paper, we presented Pokerface, a word-emotion detector that can detect the emotions of users as they read text. We built a video interface that displays subtitled videos and used the EyeTribe eye-tracker to identify which word in the subtitles a user is looking at at a given time. We used the Emotiv EPOC headset to obtain EEG brainwaves from the user and the Microsoft Kinect to obtain their facial expressions, and extracted features from both. We used Neural Networks to classify both the facial expressions and the EEG brainwaves into emotions.
Future directions of work include improving the accuracy of the FER and EEG emotion classification components. The EEG results can also be improved by exploring additional artifact detection and removal techniques. Furthermore, we want to integrate the whole pipeline into a seamless application that allows effortless experimentation.
Once the setup is streamlined, Pokerface can be used to explore many different applications to optimize users' experiences in education, news, advertising, etc. For example, the word-emotion detector can be utilized in computer-assisted learning to provide students with virtual affective support, such as detecting confusion and providing clarifications.
-