Qatar Foundation Annual Research Conference Proceedings Volume 2016 Issue 1
- Conference date: 22-23 Mar 2016
- Location: Qatar National Convention Center (QNCC), Doha, Qatar
- Volume number: 2016
- Published: 21 March 2016
Robotic Assistants in Operating Rooms in Qatar, Development phase
Authors: Carlos A. Velasquez, Amer Chaikhouni and Juan P. Wachs
Objectives
To date, no automated solution can anticipate or detect a request from a surgeon during surgical procedures without requiring the surgeon to alter his/her behavior. We are addressing this gap by developing a system that can pass the correct surgical instruments as required by the main surgeon. The study uses a manipulator robot that automatically detects and analyzes explicit and implicit requests during surgery, emulating a human nurse handling surgical equipment. This project constitutes an important step in a research project that involves other challenges related to operative efficiency and safety.
At the 2016 QF Annual Research Forum Conference, we would like to present our preliminary results from the execution of the project: first, a description of the methodology used to record surgical team interactions during several cardiothoracic procedures observed at the HMC Heart Hospital, followed by an analysis of the acquired data; second, experimental results of actual human-robot interaction tests emulating the behavior of the human nurse.
Methods
In order to study the interactions in the operating room during surgical procedures, a model of analysis was structured and applied to several cardiothoracic operations captured with MS Kinect V2 sensors. The data obtained were studied meticulously, and relevant observations were stored in a database to facilitate the analysis and comparison of events representing the different interactions among the surgical team.
Surgical Annotations
A manipulation sequence is identified in time by two or three consecutive events. For the purpose of developing an annotation structure, each database record can be divided into: information on the time of occurrence of the event, counted from the beginning of the procedure; information describing how the manipulation event occurs; information on the position of the instrument in the space around the patient; and an optional final field with brief additional notes that may help relate the event to the flow of the surgical procedure.
Figure 1: Operating room at HMC Heart Hospital. (a) Kinect sensor location (b) Surgical team and instrument locations for a cardiothoracic procedure as viewed from the sensor
1.1. Information containing the time of occurrence of the sequence
Timing information for a sequence is described by time stamps corresponding to its initial and final events. Some special sequences include an additional intermediate event that we call 'Ongoing'. All events are also counted as they occur, and the status of this counter is included as a field in the time-occurrence group.
1.2. Information describing how the manipulation sequence occurs
Careful observation of the routines performed in the operating room allowed us to identify different sequences of events, which can be classified into three general categories describing how the manipulation event occurs:
Commands, which correspond to requests for instruments or operations addressed to the supporting staff. These requests can be classified as verbal, non-verbal, or a combination of both. Commands are not made exclusively by surgeons; sometimes the nurse handling instruments also requests actions from the circulating staff.
Predictions, made by the supporting staff when selecting and preparing instruments in advance in order to hand them to the surgeon (Fig. 3). These predictions can be classified as right or wrong depending on the surgeon's decision to accept or reject the instrument when it is offered. Sometimes an instrument whose use was predicted incorrectly at a given time is required by the surgeon shortly afterwards; we classify this kind of event as a partially wrong prediction (Fig. 4).
Actions, which correspond to independent or coupled sequences necessary for the flow of the surgical procedure. For instance, as illustrated by the start and end events of Fig. 2, the surgeon himself picks up an instrument from the Mayo tray. Detailed observation of all relevant actions is essential to understand how commands are delivered, what intermediate events are triggered in response, and how instruments are handled in space and time between their original and final locations.
Table 1 summarizes the most common sequences of events found in the surgical procedures analyzed, and shows how the roles of the surgical team are distributed in relation to the events considered.
1.3. Information related to the instrument and its position in the space around the patient
The instrument is identified simply by its name. Several instances of the same instrument are used during surgery, but for annotation purposes we refer to all of them as if only one were available. In cases where some physical characteristic, such as size, differentiates an instrument from others of the same kind, a separate instrument name is selectable; in Table 2, for example, a 'Retractor' is differentiated from a 'Big Retractor'. An instrument can be located in any of the areas listed under the label 'Area' in Table 2, as can be verified from Fig. 1. If one of the members of the surgical team holds the instrument in one or both hands, the exact situation can be specified by selecting one of the options under the label 'Hands' in Table 2.
1.4. Additional information
Any noteworthy information related to the event can be included in this free-text field. For example, at some point the nurse may offer two instruments simultaneously to the surgeon, a rare situation since the exchange is usually performed one instrument at a time.
Figure 2: Example of an action: The surgeon picks up an instrument directly
Figure 3: The nurse anticipates the use of one instrument
Figure 4: A partially wrong prediction: One of two instruments is accepted
Table 1: Description of events and relations to the roles of surgical staff
Table 2: Information about the location of the instrument
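To make the annotation structure described in Sections 1.1-1.4 concrete, the sketch below models one database record as a Python dataclass. The field names and example category values are illustrative assumptions, not the exact schema used in the project.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AnnotationRecord:
    """One annotated event, following Sections 1.1-1.4 (field names are illustrative)."""
    # 1.1 Time of occurrence
    timestamp_s: float          # time from the beginning of the procedure, in seconds
    event_position: str         # 'Start', 'Ongoing' or 'End' within its sequence
    event_counter: int          # running count of events as they occur
    # 1.2 How the manipulation occurs
    category: str               # 'Command', 'Prediction' or 'Action'
    subtype: str                # e.g. 'verbal', 'non-verbal', 'right', 'wrong', 'partially wrong'
    actor: str                  # role of the team member involved, e.g. 'Surgeon', 'Nurse'
    # 1.3 Instrument and its position around the patient
    instrument: str             # e.g. 'Retractor', 'Big Retractor'
    area: str                   # one of the areas listed under 'Area' in Table 2
    hands: Optional[str] = None # option under 'Hands' in Table 2, if held by a team member
    # 1.4 Additional information
    notes: Optional[str] = None # free-text remarks, e.g. 'two instruments offered simultaneously'

# Hypothetical record: the nurse correctly anticipates a retractor request
example = AnnotationRecord(125.4, 'Start', 37, 'Prediction', 'right',
                           'Nurse', 'Retractor', 'Mayo tray', None,
                           'instrument prepared before the verbal request')
```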
2. Annotation Software Tool
Based on libraries and information provided by Microsoft, we wrote code that allows MS Kinect Studio to be used to annotate the surgical procedures. Kinect Studio has several advantages compared to the other tools we evaluated, such as high precision in identifying the length and timing of a sequence and efficiency in analyzing simultaneous streams of information. Figure 5 shows the screen presented by Kinect Studio when annotations are being made for the color stream of the surgical recording used as an example in the same illustration. The color stream is rendered at 30 fps, which means that on average a frame is available to annotate roughly every 0.033 s. The blue markers on the timeline are located at events of interest. On the left side of the screen, a set of fields corresponding to the information of interest is displayed to be filled in for each event of interest.
Figure 5: Annotations in Kinect Studio are introduced as metadata fields for the Timeline Marker
The collection of entries describing the interaction is written to an output text file that can be processed with conventional database software tools. The annotations obtained within MS Kinect Studio are exported as a text table whose record structure is presented in Fig. 6; each record contains the events and their relations to the roles of the surgical staff listed in Table 1, as well as the instrument information fields presented in Table 2.
Figure 6: Structure of the annotation record obtained from the metadata associated to the timeline markers in Kinect Studio
Figure 7: Annotations Database as processed within Excel for a surgery of 30 minutes
The annotation database obtained within Kinect Studio for the surgical procedure1 used as an example in this report was exported to MS Excel for analysis. A partial view of this database is presented in Fig. 7, showing some of the first sequences stored. The colors in the third column differentiate events that belong to the same sequence; they are chosen arbitrarily, and once the final event of a sequence has been identified the same color becomes available to mark a new sequence. In total, the database for this example contains 259 records covering a period of 30 minutes. Queries performed using database functionalities generate the results for predictions and commands illustrated in Fig. 8 and Fig. 9.
Figure 8: Predictions: (a) Discrimination as right, wrong or partially wrong (b) Instruments received by the surgeon (c) Nurse hand holding instruments (d) Instruments rejected by the surgeon (e) Time elapsed while the prediction is performed
Figure 9: Commands: (a) Discrimination as verbal, nonverbal or a combination of both (b) Instruments requested (c) Time elapsed in verbal commands (d) Time elapsed in nonverbal commands (e) Time elapsed while the instrument is requested
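As a minimal sketch of the kind of database queries mentioned above, the snippet below loads the Kinect Studio export and produces Fig. 8/9-style breakdowns; pandas stands in for the Excel database functionalities, and the file name and column names (category, subtype, instrument) are assumptions rather than the actual export schema.

```python
import pandas as pd

# Load the annotation table exported from Kinect Studio (column names are assumptions).
df = pd.read_csv("annotations_export.txt", sep="\t")

# Fig. 8-style summary: predictions broken down into right / wrong / partially wrong.
predictions = df[df["category"] == "Prediction"]
print(predictions["subtype"].value_counts())

# Fig. 9-style summary: commands split into verbal / non-verbal / both,
# plus the instruments most frequently requested.
commands = df[df["category"] == "Command"]
print(commands["subtype"].value_counts())
print(commands["instrument"].value_counts().head(10))
```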
Experimental Setup
As a preliminary step toward operating a manipulator robot as a robotic nurse, surgical personnel at the HMC Heart Hospital are asked to perform a mock knot on a synthetic model, as illustrated in Fig. 10. While the task is performed, a Kinect sensor captures body position and hand gestures as well as voice commands. This information is processed by a workstation running Windows-compatible software that controls the robot, which reacts by passing the requested surgical instrument to the subject so that the task can be completed.
Figure 10: (a) Mock knot used as a preliminary test of interaction (b) Robotic setup for experiments at the HMC Heart Hospital
The robot used is a Barrett robotic manipulator with seven degrees of freedom, as shown in Fig. 10. This FDA-approved robot is one of the most advanced robotic systems considered safe to operate around human subjects, since it has force-sensing capabilities that are used to avoid potential impacts.
Summary
As part of the development of an NPRP project that studies the feasibility of having robotic nurses in the operating room that can recognize verbal and nonverbal commands to deliver instruments from the tray to the hand of the surgeon, we have studied the interaction activities of surgical teams performing cardiothoracic procedures at the Heart Hospital in Doha. Using state-of-the-art sensor devices, we captured a wealth of information that has been carefully analyzed and annotated into databases. At the 2016 QF Annual Research Forum Conference we would like to present our current findings as well as the results of human-robot interaction tests with a manipulator robot acting as a robotic nurse in a task that involves gesture/verbal recognition, recognition of the instrument, and safe delivery to the surgeon.
1 Wedge Lung Resection. In this procedure the surgeon removes a small wedge-shaped piece of lung that contains cancer and a margin of healthy tissue around the cancer.
Design and Performance Analysis of VLC-based Transceiver
Authors: Amine Bermak and Muhammad Asim Atta
Background
As the number of handheld devices increases, wireless data traffic is expanding exponentially. With the ever-increasing demand for higher data rates, it will be very challenging for system designers to meet requirements using the limited Radio Frequency (RF) communication spectrum. One possible remedy to this problem is the use of the freely available visible light spectrum [1].
Introduction
This paper proposes an indoor communication system based on Visible Light Communication (VLC). VLC technology utilizes the visible light spectrum (380–750 nm) not only for illumination but also for data communication [2]. Visible Light Communication exploits the high-frequency switching capabilities of Light Emitting Diodes (LEDs) to transmit data. A receiver, generally containing a photodiode, receives signals from the optical source and can easily decode the information being transmitted. In practical systems, a CMOS imager containing an array of photodiodes is preferred as a receiver over a single photodiode. Such a receiver enables multi-target detection and multi-channel communication, resulting in a more robust transceiver architecture [3].
Method
This work demonstrates a real-time transceiver implementation for Visible Light Communication on an FPGA. A Pseudo-Noise (PN) sequence is generated that acts as input data for the transmitter. A Direct Digital Synthesizer (DDS) is implemented to generate the carrier signal for modulation [4]. The transmitter uses On-Off Keying (OOK) to modulate the incoming data because of its simplicity [5]. The modulated signal is then converted into analog form using a Digital-to-Analog Converter (DAC). An analog driver circuit connected to the digital transmitter is capable of driving an array of Light Emitting Diodes (LEDs) for data transmission. The block-level architecture of the VLC transmitter is shown in Fig. 1.
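As an illustration of this transmitter chain (PN data, DDS carrier, OOK), the numpy sketch below simulates the digital signal that would be handed to the DAC. The sample rate, the random-bit stand-in for the PN generator, and the phase-accumulator style of the carrier are assumptions for illustration, not the actual FPGA parameters.

```python
import numpy as np

fs = 50e6          # sample rate of the digital design (example value)
f_carrier = 5e6    # carrier frequency produced by the DDS
bit_rate = 1e6     # PN data rate
samples_per_bit = int(fs / bit_rate)

# Pseudo-noise data: random bits stand in for an LFSR-generated PN sequence.
rng = np.random.default_rng(0)
pn_bits = rng.integers(0, 2, size=64)

# DDS-style carrier: a phase ramp driving a sine lookup.
n = np.arange(len(pn_bits) * samples_per_bit)
carrier = np.sin(2 * np.pi * f_carrier * n / fs)

# On-Off Keying: the carrier is transmitted for '1' bits and suppressed for '0' bits.
gate = np.repeat(pn_bits, samples_per_bit)
ook_signal = gate * carrier   # this waveform would feed the DAC / LED driver
```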
The receiver architecture uses analog circuitry, including photodiodes for optical detection and operational amplifiers for amplification of the received signal. Analog-to-Digital Conversion (ADC) is performed before the data is passed back to the FPGA for demodulation and data reconstruction. Figure 2 shows the architecture of the VLC receiver.
Results and Conclusion
The system is implemented and tested using a Xilinx Spartan 3A series FPGA [6]. The basic transceiver implementation uses a data rate of 1 Mbps with a carrier frequency of 5 MHz. In VLC, however, the data rate and carrier frequency directly affect the optical characteristics of the LEDs, including color and intensity. Therefore, different data rates and modulation frequencies are evaluated to achieve optimum data transmission with minimal effect on the optical characteristics of the LEDs. System complexity in terms of hardware resources and performance, including Bit Error Rate (BER) under varying conditions, is also compared.
The results demonstrate that it is feasible to establish a low-data-rate communication link for indoor applications over ranges of up to 10 m using commercially available LEDs. Integrating a CMOS imager at the receiver end will enable a VLC-based Multiple-Input-Multiple-Output (MIMO) communication link that can serve multiple channels, with up to one channel per pixel [3]. Higher data rates are also achievable by utilizing high-data-rate modulation techniques (such as OFDM) at the expense of computational complexity and hardware resource utilization [7].
One possible implication of this work is the implementation of a VLC-based indoor positioning and navigation system. It could benefit large public-facing facilities, including but not limited to hospitals, customer support centers, public service offices, shopping malls and libraries. The system would largely reuse the existing indoor illumination infrastructure with the added advantage of data communication.
The study also proposes extending this work to outdoor VLC applications. However, more robust algorithms are required for outdoor communication due to optical noise and interference caused by weather and atmospheric conditions. The robustness of the existing algorithm can be increased by integrating Direct Sequence Spread Spectrum (DSSS) together with OOK for modulation. Further research is required to evaluate the performance, complexity and robustness of such a system under realistic conditions.
References
[1]Cisco Visual Networking Index, “Global Mobile Data Traffic Forecast Update, 2012–2017,” CISCO, White Paper, Feb. 2013.
[2] D. Terra, N. Kumar, N. Lourenco, and L. N. Alves, “Design, development and performance analysis of DSSS-based transceiver for VLC,” EUROCON - International Conference on Computer as a Tool (EUROCON), IEEE, 2011.
[3]“Image Sensor Communication”. VLCC Consortium.
[4] Xilinx DDS Compiler IP Core, “http://www.xilinx.com/products/intellectual property/dds_compiler.html#documentation”.
[5] Nuno Lourenço, Domingos Terra, Navin Kumar, Luis Nero Alves, and Rui L. Aguiar, “Visible Light Communication System for Outdoor Applications,” 8th IEEE/IET International Symposium on Communication Systems, Networks and Digital Signal Processing.
[6] Xilinx Spartan-3A Starter Kit, “http://www.xilinx.com/products/boards-and-kits/hw-spar3a-sk-uni-g.html”.
[7]Liane Grobe, Anagnostis Paraskevopoulos, Jonas Hilt, Dominic Schulz, Friedrich Lassak, Florian Hartlieb, Christoph Kottke, Volker Jungnickel, and Klaus-Dieter Langer, “High Speed Visible Light Communication Systems”, IEEE Communications Magazine, December 2013.
A Robust Unified Framework of Vehicle Detection and Tracking for Driving Assistance System with High Efficiency
Authors: Amine Bermak and Bo Zhang
Background
Research by the Qatar Road Safety Studies Center (QRSCC) found that the total number of traffic accidents in Qatar was 290,829 in 2013, with an economic cost amounting to 2.7 percent of the country's gross domestic product (GDP). There is a growing research effort to improve road safety and to develop automobile driving-assistance systems or even self-driving systems such as Google's project, which is widely expected to revolutionize the automotive industry. Vision sensors will play a prominent role in such applications because they provide intuitive and rich information about road conditions. However, vehicle detection and tracking based on vision information is a challenging task because of the large variability in the appearance of vehicles, interference from strong light and sometimes severe weather conditions, and complex interactions amongst drivers.
Objective
While previous work usually regards vehicle detection and tracking as separate tasks [1, 2], we propose a unified framework for both. In the detection phase, recent work has mainly focused on building detection systems based on robust feature sets such as histograms of oriented gradients (HOG) [3] and Haar-like features [4] rather than simple features such as symmetry or edges. However, these robust features involve heavy computational requirements. In this work, we propose an algorithmic framework designed to target both high efficiency and robustness while keeping the computational requirements at an acceptable level.
Method
In the detection phase, in order to reduce processing latency, we propose to use a hardware-friendly corner detection method, features from accelerated segment test (FAST) [5], which determines interest corners by comparing each pixel with the pixels on a surrounding circle: if a sufficiently long run of contiguous pixels on that circle are all brighter or darker than the center pixel, the center is marked as a corner point. Fig. 1 shows the result of the FAST corner detector on a real road image. We use the recent Learned Arrangements of Three Patch Codes (LATCH) [6] as the corner point descriptor. The descriptor falls into the binary descriptor category but still maintains performance comparable to histogram-based descriptors (such as HOG). The descriptors created by LATCH are binary strings computed by comparing image patch triplets rather than image pixels and, as a result, they are less sensitive to noise and minor changes in local appearance. In order to detect vehicles, corners in successive images are matched to those in previous images, so that the optical flow at each corner point can be derived from the movement of the corner points. Because approaching vehicles in the opposite direction produce a diverging flow, they can be distinguished from the flow caused by ego-motion. Fig. 2 illustrates the flow estimated from corner point matching. The sparse optical flow proposed here is quite robust because of the LATCH characteristics, and it also requires much lower computational resources compared to traditional optical flow methods that need to solve a time-consuming optimization problem.
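A sketch of the corner-detection and matching step is given below using OpenCV's FAST detector and LATCH descriptor (the latter lives in the opencv-contrib package). This is an illustration of the general approach under assumed thresholds, not the authors' FPGA implementation.

```python
import cv2

# Sparse optical flow from FAST corners + LATCH binary descriptors.
# Requires opencv-contrib-python for cv2.xfeatures2d.LATCH_create().
fast = cv2.FastFeatureDetector_create(threshold=25)
latch = cv2.xfeatures2d.LATCH_create()
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def sparse_flow(prev_gray, curr_gray):
    """Return matched corner pairs (previous -> current) as point pairs."""
    kp1 = fast.detect(prev_gray, None)
    kp2 = fast.detect(curr_gray, None)
    kp1, des1 = latch.compute(prev_gray, kp1)
    kp2, des2 = latch.compute(curr_gray, kp2)
    if des1 is None or des2 is None:
        return []
    matches = matcher.match(des1, des2)
    # Each match gives a flow vector; diverging vectors hint at approaching vehicles.
    return [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in matches]
```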
Once vehicles are detected, the tracking phase is achieved by matching the corner points. Using a Kalman filter for prediction, the matching is fast because candidate corner points are searched only near the predicted location. Using corner points to compute sparse optical flow enables vehicle detection and tracking to be carried out simultaneously within this unified framework (Fig. 3). In addition, the framework allows us to detect cars that newly enter the scene during tracking. Since most image sensors today are based on a rolling-shutter integration approach, the image information can be transmitted to the FPGA-based hardware serially, and hence the FAST detector and LATCH descriptor can work in a pipelined manner to achieve efficient computation.
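The following minimal sketch shows how a constant-velocity Kalman filter could restrict corner matching to a window around the predicted position, in the spirit of the tracking phase described above; the state layout, noise covariances and search radius are illustrative assumptions.

```python
import cv2
import numpy as np

def make_corner_tracker(x0, y0):
    """Constant-velocity Kalman filter for one corner point (state: x, y, vx, vy)."""
    kf = cv2.KalmanFilter(4, 2)
    kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                    [0, 1, 0, 1],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], dtype=np.float32)
    kf.measurementMatrix = np.eye(2, 4, dtype=np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32)
    kf.errorCovPost = np.eye(4, dtype=np.float32)
    kf.statePost = np.array([[x0], [y0], [0], [0]], dtype=np.float32)
    return kf

def track_step(kf, candidate_points, search_radius=20.0):
    """Predict the next position and match only candidates inside the search window."""
    pred = kf.predict()
    px, py = float(pred[0, 0]), float(pred[1, 0])
    near = [p for p in candidate_points
            if (p[0] - px) ** 2 + (p[1] - py) ** 2 <= search_radius ** 2]
    if not near:
        return (px, py)          # no measurement found: fall back to the prediction
    best = min(near, key=lambda p: (p[0] - px) ** 2 + (p[1] - py) ** 2)
    kf.correct(np.array([[best[0]], [best[1]]], dtype=np.float32))
    return best
```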
Conclusion
In this work, we propose a framework for detecting and tracking vehicles for driving-assistance applications. Vehicles are detected from the sparse optical flow estimated through corner point matching, and vehicle tracking is also performed through corner point matching with the assistance of a Kalman filter. The proposed framework is robust and efficient and has much lower computational requirements, making it a very viable solution for embedded vehicle detection and tracking systems.
References
[1] S. Sivaraman and M. M. Trivedi, “Looking at vehicles on the road: A survey of vision-based vehicle detection, tracking, and behavior analysis,” IEEE Trans. Intell. Transp. Syst., vol. 14, no. 4, pp. 1773–1795, 2013.
[2] Z. Sun, G. Bebis, and R. Miller, “On-Road Vehicle Detection: A Review,” vol. 28, no. 5, pp. 694–711, 2006.
[3] Z. Sun, G. Bebis, and R. Miller, “Monocular precrash vehicle detection: Features and classifiers,” IEEE Trans. Image Process., vol. 15, no. 7, pp. 2019–2034, 2006.
[4] W. C. Chang and C. W. Cho, “Online boosting for vehicle detection,” IEEE Trans. Syst. Man, Cybern. Part B Cybern., vol. 40, no. 3, pp. 892–902, 2010.
[5] E. Rosten and T. Drummond, “Fusing points and lines for high performance tracking,” Tenth IEEE Int. Conf. Comput. Vis. (ICCV), vol. 2, pp. 1508–1515, 2005.
[6] G. Levi and T. Hassner, “LATCH: Learned Arrangements of Three Patch Codes,” arXiv, 2015.
On Arabic Multi-Genre Corpus Diacritization
Authors: Houda Bouamor, Wajdi Zaghouani, Mona Diab, Ossama Obeid, Kemal Oflazer, Mahmoud Ghoneim and Abdelati Hawwari
One of the characteristics of writing in Modern Standard Arabic (MSA) is that the commonly used orthography is mostly consonantal and does not provide full vocalization of the text. It sometimes includes optional diacritical marks (henceforth, diacritics or vowels).
Arabic script consists of two classes of symbols: letters and diacritics. Letters comprise long vowels such as A, y, w as well as consonants. Diacritics, on the other hand, comprise short vowels, gemination markers, nunation markers, as well as other markers (such as hamza, the glottal stop, which appears in conjunction with a small number of letters, dots on letters, elongation and emphatic markers) which, if present, together render a more or less exact reading of a word. In this study, we mostly address three types of diacritical marks: short vowels, nunation, and shadda (gemination).
Diacritics are extremely useful for text readability and understanding. Their absence in Arabic text adds another layer of lexical and morphological ambiguity. Naturally occurring Arabic text has some percentage of these diacritics present depending on genre and domain. For instance, religious text such as the Quran is fully diacritized to minimize the chances of reciting it incorrectly. So are children's educational texts. Classical poetry tends to be diacritized as well. However, news text and other genres are sparsely diacritized (e.g., around 1.5% of tokens in the United Nations Arabic corpus bear at least one diacritic (Diab et al., 2007)).
In general, building models to assign diacritics to each letter in a word requires a large amount of annotated training corpora covering different topics and domains to overcome the sparseness problem. The currently available diacritized MSA corpora are generally limited to the newswire genres (those distributed by the LDC) or religion-related texts such as the Quran or the Tashkeela corpus. In this paper we present a pilot study in which we annotate a sample of non-diacritized text extracted from five different text genres. We explore different annotation strategies where we present the data to the annotator in three modes: basic (only forms with no diacritics), intermediate (basic forms plus POS tags), and advanced (a list of forms that is automatically diacritized). We show the impact of the annotation strategy on the annotation quality.
It has been noted in the literature that complete diacritization is not necessary for readability (Hermena et al., 2015) or for NLP applications; in fact, Diab et al. (2007) show that full diacritization has a detrimental effect on SMT. Hence, we are interested in discovering the optimal level of diacritization. Accordingly, we explore different levels of diacritization. In this work, we limit our study to two diacritization schemes: FULL and MIN. For FULL, all diacritics are explicitly specified for every word. For MIN, we explore the minimum and optimal number of diacritics that needs to be added in order to disambiguate a given word in context and make a sentence easily readable and unambiguous for any NLP application.
We conducted several experiments on a set of sentences extracted from five corpora covering different genres. We selected three corpora from the currently available Arabic Treebanks from the Linguistic Data Consortium (LDC). These corpora were chosen because they are fully diacritized and had undergone significant quality control, which allows us to evaluate the annotation accuracy as well as our annotators' understanding of the task. We selected a total of 16,770 words from these corpora for annotation. Three native Arabic annotators with good linguistic background annotated the corpora samples. Diab et al. (2007) define six different diacritization schemes that are inspired by the observation of the relevant naturally occurring diacritics in different texts. We adopt the FULL diacritization scheme, in which all the diacritics should be specified in a word. Annotators were asked to fully diacritize each word.
The text genres were annotated following the different strategies:
- Basic: In this mode, we ask for annotation of words where all diacritics are absent, including the naturally occurring ones. The words are presented in a raw tokenized format to the annotators in context.
- Intermediate: In this mode, we provide the annotator with words along with their POS information. The intuition behind adding POS is to help the annotator disambiguate a word by narrowing down on the diacritization possibilities.
- Advanced: In this mode, the annotation task is formulated as a selection task instead of an editing task. Annotators are provided with a list of automatically diacritized candidates and are asked to choose the correct one, if it appears in the list. Otherwise, if they are not satisfied with the given candidates, they can manually edit the word and add the correct diacritics. This technique is designed to reduce annotation time and especially the annotator workload. For each word, we generate a list of vowelized candidates using MADAMIRA (Pasha et al., 2014). MADAMIRA is able to achieve a lemmatization accuracy of 99.2% and a diacritization accuracy of 86.3%. We present the annotator with the top three candidates suggested by MADAMIRA, when possible. Otherwise, only the available candidates are provided.
We also provided annotators with detailed guidelines, describing our diacritization scheme and specifying how to add diacritics for each annotation strategy. We described the annotation procedure and specified how to deal with borderline cases. We also provided in the guidelines many annotated examples to illustrate the various rules and exceptions.
In order to determine the most efficient annotation setup for the annotators, in terms of speed and effort, we compare the results obtained with the three annotation strategies. These annotations were all conducted for the FULL scheme. We first calculated the number of words annotated per hour for each annotator and in each mode. As expected, in the Advanced mode our three annotators could annotate an average of 618.93 words per hour, roughly double the number annotated in the Basic mode (only 302.14 words). Adding POS tags to the basic forms, as in the Intermediate mode, does not accelerate the process much: only about 90 more words are diacritized per hour compared to the Basic mode.
Then, we evaluated the Inter-Annotator Agreement (IAA) to quantify the extent to which independent annotators agree on the diacritics chosen for each word. For every text genre, two annotators were asked to annotate independently a sample of 100 words.
We measured the IAA between two annotators by averaging the WER (Word Error Rate) over all pairs of annotated words. The higher the WER between two annotations, the lower their agreement. The results obtained show clearly that the Advanced mode is the best strategy to adopt for this diacritization task. It is the least confusing method across all text genres (with WER between 1.56 and 5.58).
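As a small illustration of this agreement measure, the sketch below computes a word-level error rate between two annotators' diacritizations of the same token sequence; the toy transliterations are invented for illustration and this is a simplified reading of the WER-based IAA described above.

```python
def word_error_rate(ann_a, ann_b):
    """Percentage of words whose diacritized forms differ between two annotators."""
    assert len(ann_a) == len(ann_b)
    disagreements = sum(1 for a, b in zip(ann_a, ann_b) if a != b)
    return 100.0 * disagreements / len(ann_a)

# Toy example with Buckwalter-style transliterations (illustrative only):
annotator_1 = ["kataba", "Alwaladu", "Aldarsa"]
annotator_2 = ["kataba", "Alwalada", "Aldarsa"]
print(word_error_rate(annotator_1, annotator_2))  # 33.3: the annotators disagree on one word

# IAA per genre = WER averaged over that genre's annotated samples;
# a lower WER means higher agreement.
```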
We also conducted a preliminary study for a minimum diacritization scheme. This is a diacritization scheme that encodes the most relevant differentiating diacritics to reduce confusability among words that look the same (homographs) when undiacritized but have different readings. Our hypothesis in MIN is that there is an optimal level of diacritization to render a text unambiguous for processing and enhance its readability. We showed the difficulty in defining such a scheme and how subjective this task can be.
Acknowledgement
This publication was made possible by grant NPRP-6-1020-1-199 from the Qatar National Research Fund (a member of the Qatar Foundation).
QUTor: QUIC-based Transport Architecture for Anonymous Communication Overlay Networks
Authors: Raik Aissaoui, Ochirkhand Erdene-Ochir, Mashael Al-Sabah and Aiman Erbad
In this new century, the growth of Information and Communication Technology (ICT) has had a significant influence on our lives. The widespread use of the internet has created an information society in which the creation, distribution, use, integration and manipulation of information is a significant economic, political, and cultural activity. However, it has also brought its own set of challenges. Internet users have become increasingly vulnerable to online threats like botnets, Denial of Service (DoS) attacks and phishing spam mail. Stolen user information can be exploited by many third-party entities. Some Internet Service Providers (ISPs) sell this data to advertising companies, which analyse it and build marketing strategies to influence customer choices, breaking their privacy. Oppressive governments exploit revealed private user data to harass members of opposition parties, activists from civil society and journalists. Anonymity networks have been introduced to allow people to conceal their identity online. This is done by providing unlinkability between the user's IP address, his digital fingerprint, and his online activities. Tor is the most widely used anonymity network today, serving millions of users on a daily basis using a growing number of volunteer-run routers [1]. Clients send their data to their destinations through a number of volunteer-operated proxies, known as Onion Routers (ORs). If a user wants to use the network to protect his online privacy, he installs the Onion Proxy (OP), which bootstraps by contacting centralized servers, known as authoritative directories, to download the needed information about the ORs serving in the network. The OP then builds overlay paths, known as circuits, which consist of three ORs (entry guard, middle and exit), where only the entry guard knows the user and only the exit knows the destination. Tor helps internet users hide their identities; however, it introduces large and highly variable delays in response and download times during web surfing, which can be inconvenient for users. Traffic congestion adds further delays and variability to the performance of the network. In addition, Tor uses an end-to-end flow-control approach that does not react to congestion in the network.
To improve Tor performance, we propose to integrate QUIC into Tor. QUIC [2] (Quick UDP Internet Connections) is a new multiplexed and secure transport built atop UDP, developed by Google. QUIC is implemented over UDP to solve a number of transport-layer and application-layer problems experienced by modern web applications. It reduces connection establishment latency: QUIC handshakes frequently require zero round trips before sending payload. It also improves congestion control and multiplexes streams without head-of-line blocking: because QUIC is designed for multiplexed streams, lost packets carrying data for an individual stream generally only impact that specific stream. In order to recover from lost packets without waiting for a retransmission, QUIC can complement a group of packets with a Forward Error Correction (FEC) packet. QUIC connections are identified by a 64-bit connection identifier (ID). When a QUIC client changes Internet Protocol (IP) addresses, it can continue to use the old connection ID from the new IP address without interrupting any in-flight requests. QUIC provides multiplexing and flow control equivalent to HTTP/2, security equivalent to TLS, and connection semantics, reliability, and congestion control equivalent to TCP. QUIC shows good performance compared to HTTP/1.1 [3]. We expect good results in improving the performance of Tor, since QUIC is one of the most promising solutions to decrease latency [4]. A QUIC stream is a bi-directional flow of bytes across a logical channel within a QUIC connection; the latter is a conversation between two QUIC endpoints with a single encryption context that multiplexes streams within it. QUIC's multiple-stream architecture improves Tor performance and solves the head-of-line blocking problem. As a first step, we implemented QUIC in the OR nodes so that they can be easily upgraded to the new architecture without modifying the end user's OP. Integrating QUIC will not degrade Tor security, as it provides security equivalent to TLS (QUIC Crypto) and will soon use TLS 1.3.
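To illustrate why per-stream ordering matters, the toy simulation below compares delivery delay when one packet is lost under a single totally ordered byte stream (TCP-like) versus independent QUIC-like streams; the packet counts and timings are arbitrary and purely illustrative.

```python
# Toy head-of-line blocking illustration: 9 packets sent round-robin over 3 streams.
# Packet 2 (on stream 2) is lost and its retransmission arrives late, at t = 12.
arrival = {i: i + 1.0 for i in range(9)}
arrival[2] = 12.0
stream_of = {i: i % 3 for i in range(9)}

# TCP-like single ordered stream: packet i is usable only once all packets <= i have arrived.
tcp_delivery = {i: max(arrival[j] for j in range(i + 1)) for i in range(9)}

# QUIC-like per-stream ordering: packet i waits only for earlier packets of its own stream.
quic_delivery = {i: max(arrival[j] for j in range(i + 1) if stream_of[j] == stream_of[i])
                 for i in range(9)}

for i in range(9):
    print(f"packet {i} (stream {stream_of[i]}): tcp={tcp_delivery[i]:4.1f}  quic={quic_delivery[i]:4.1f}")
# Under the TCP-like model every packet after the loss is delayed until t = 12,
# whereas with per-stream ordering only stream 2 is affected.
```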
Cognitive Dashboard for Teachers Professional Development
Authors: Riadh Besbes and Seifeddine Besbes
Introduction
This research aims to enhance the culture of data in education, which is in the middle of a major transformation driven by technology and big data analytics. The core purpose of schools is providing an excellent education to every learner; data can be the lever of that mission. Big data analytics is the process of examining large data sets containing a variety of data types to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. Valuable lessons can be learnt from other industries when considered in terms of their practicality for public education. Hence, big data analytics, also known as educational data mining and learning analytics, develops capacity for quantitative research in response to the growing need for evidence-based analysis related to education policy and practice. However, education has been slow to follow the data analytics evolution due to difficulties surrounding what data to collect, how to collect those data and what they might mean. Our research identifies, quantifies, and measures qualitative teaching practices and learning performances, and tracks learners' academic progress. Teaching and learning databases are accumulated from quantitative “measures” obtained through in-class visits within academic institutions, learners' answers to online questionnaires, analysis of the written statements of academic exams in mathematics, science, and literacy disciplines, and online entry of elementary grades from the written traces of learners' performances in mathematics, science, and literacy exams. The project's data mining strategy will support and develop teachers' expertise, enhance and scaffold students' learning, and improve and raise the education system's performance. The supervisor's expertise will mentor the researcher in extracting information and educational knowledge from the collected data. As a consequence, the researcher will acquire the wisdom to translate this knowledge into more effective training sessions on concrete educational policies.
State-of-the-art
Anne Jorro says: “to evaluate is necessarily to consider how we will support, advise, exchange, and give recognition to encourage the involvement of the actor, giving him the means to act”. The PISA report states that many of the world's best-performing education systems have moved from bureaucratic “command and control” environments towards school systems in which the people at the front line have much more control. Making teaching and learning data available leads to information and then to knowledge extraction. As advised by the PISA report, the effective use of extracted knowledge drives decision-making towards wisdom. Linda Darling-Hammond and Charles E. Ducommun underscore the important assumption that teachers are the fulcrum that has the biggest impact on whether any school initiative leads toward success or failure. Rivkin et al. state that a teacher's classroom instructional practice is perhaps one of the most important yet least understood factors contributing to teacher effectiveness. As a consequence, many classroom observation tools have been designed to demystify effective teaching practices. The Classroom Assessment Scoring System (CLASS) is a well-respected classroom climate observational system. The CLASS examines three domains of behaviour: firstly, emotional support (positive classroom climate, teacher sensitivity, and regard for student perspectives); secondly, classroom organization (effective behaviour management, productivity, and instructional learning formats); and thirdly, instructional supports (concept development, quality of feedback, and language modelling). The Framework for Teaching method for evaluation through classroom observation was most recently released in its 2013 edition. It divides the complex activity of teaching into 22 components clustered into four domains of teaching responsibility. This latest edition of the tool was conceived to respond to the instructional implications of the American Common Core State Standards. Those standards envision, for literacy and mathematics initially, deep engagement by students with important concepts, skills, and perspectives. They emphasize active, rather than passive, learning by students. In all areas, they place a premium on deep conceptual understanding, thinking and reasoning, and the skill of argumentation. Heather Hill of Harvard University and Deborah Loewenberg Ball of the University of Michigan developed the “Mathematical Quality of Instruction (MQI)” instrument. Irving Hamer is an education consultant and a former deputy superintendent for academics, technology, and innovation for a school system.
Objectives
Our project's wider objective is to improve teaching and learning effectiveness within K-12 classes by exploiting data mining methods for educational knowledge extraction. The researcher makes three daily visits to mathematics, science, and literacy courses. Using his interactive educational grid, an average of 250 numerical data points is stored as quantified teaching and learning practices for each classroom visit for every teacher. In parallel with these on-field activities, remote interaction is handled via a website. At the beginning, and only once, each learner from the classes planned to be visited fills in an individual questionnaire form for learning-style identification. On another website form, each learner enters every elementary grade for each question of his maths, science and literacy exam answer sheets. The exam statements were previously analysed and saved on the website by the researcher. An average of 150 numerical data points is stored as quantified learning performances for every learner. Meetings at the partner university for data analytics and educational knowledge extraction were held, followed by meetings at the inspectorate headquarters for in-depth data review. Then, in partner schools, training sessions were the theatres of constructive reflections and feedback on the major findings about teaching and learning effectiveness. These actions were reiterated over months. Each year, the performance of about 1000 students and the educational practices of about 120 teachers will be specifically tracked. During the summer months, workshops, seminars, and an international conference will be organised for stakeholders from the educational field. Thus, among the project's actions, three specific objectives shall be achieved. First, sufficient data on students' profiles and performances related to educational weaknesses and strengths will be provided. Second, teachers' practices inside classrooms at each partner school will be statistically recorded. Third, a complete data mining centre for educational research will be conceived and cognitively interpreted by the researchers' teams; findings will then be presented for teachers' reflexive thoughts and discussions within meetings and training sessions.
Research methodology and approach
Dynamic Scheduled Access Medium Access Control for Emerging Wearable Applications
Authors: Muhammad Mahtab Alam, Dhafer Ben-Arbia and Elyes Ben-Hamida
Context and Motivation
Wearable technology is emerging as one of the key enablers of the internet of everything (IoE). The technology is maturing every day, with more applications than ever before, and is consequently making a significant impact on the consumer electronics industry. With this continuous exponential rise, it is anticipated that by 2019 there will be more than 150 million wearable devices worldwide [1]. Whilst fitness and health care remain the dominant wearable applications, other applications, including fashion and entertainment, augmented reality, and rescue and emergency management, are emerging as well [2]. In this context, Wireless Body Area Networks (WBAN) are a well-known research discipline that fosters and contributes to the rapid growth of wearable technology. The IEEE 802.15.6 standard targeted at WBAN provides great flexibility and provisions at both the physical (PHY) and medium access control (MAC) layers [3].
Wearable devices are constrained by limited battery capacity, miniaturized form factors, and low processing and storage capabilities. While energy efficiency remains one of the most important challenges, a low-duty-cycle and dynamic MAC layer design is critical for the longer life of these devices. In this regard, the scheduled access mechanism is considered one of the most effective MAC approaches in WBAN, in which every sensor node can have a dedicated time slot to transfer its data to the BAN coordinator. However, for a given application, every node (i.e., connected sensor) has a different data transmission rate [4]; therefore, the scheduled access mechanism has to adapt the slot allocation accordingly to meet the design constraints (i.e., energy efficiency, packet delivery and delay requirements).
Problem Description
The scheduled access MAC with a 2.4 GHz operating frequency, the highest data rate (i.e., 971 Kbps), and the highest payload (i.e., 256 bytes) provides the maximum throughput in the IEEE 802.15.6 standard. However, both the packet delivery ratio (PDR) and the delay in this configuration are very poor at transmission powers of -10 dBm and below [5]. The presented study focuses on this particular PHY-MAC configuration and aims to understand the maximum realistic achievable throughput while operating at the lowest transmission power for future IEEE 802.15.6 compliant transceivers. In addition, the objective is to enhance the performance under realistic mobility patterns, i.e., space- and time-varying channel conditions.
Contribution
In this paper we address the reliability concern of the above-mentioned wearable applications while using the IEEE 802.15.6 (high data rate) PHY-MAC configuration. The objective is to enhance system performance by exploiting the m-periodic scheduled access mechanism. We propose a throughput- and channel-aware dynamic scheduling algorithm that provides a realistic throughput under dynamic mobility and space- and time-varying links. First, various mobility patterns are generated with special emphasis on space- and time-varying links, because their performance is most vulnerable in a dynamic environment. Deterministic pathloss values (as an estimate of the channel) are obtained from a motion-capture system and bio-mechanical modeling, and the signal-to-noise ratio (SNR), bit error rate (BER) and packet error rate (PER) are then calculated. In its first phase, the proposed algorithm uses the estimated PER to select the potential nodes for a time slot; in the second phase, based on node priority and data packet availability among the potential candidates, the slot is finally assigned to one node. This process is iterated by the coordinating node until the end of a superframe.
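A simplified sketch of this two-phase slot-assignment idea is given below: nodes are first filtered by their estimated PER, and the slot then goes to the highest-priority candidate with data waiting. The threshold, the priority encoding and the PER estimates are illustrative placeholders, not the parameters of the actual algorithm.

```python
def assign_slot(nodes, per_estimate, per_threshold=0.1):
    """Assign one time slot.

    nodes: list of dicts with 'id', 'priority' (lower value = higher priority)
           and 'queued_packets'.
    per_estimate: dict node id -> estimated packet error rate for this slot,
                  derived from the pathloss / SNR / BER chain.
    """
    # Phase 1: keep only nodes whose estimated channel is good enough.
    candidates = [n for n in nodes if per_estimate[n["id"]] <= per_threshold]
    # Phase 2: among candidates with data to send, pick the highest-priority node.
    ready = [n for n in candidates if n["queued_packets"] > 0]
    if not ready:
        return None  # slot left unassigned (or handled by a fallback policy)
    return min(ready, key=lambda n: (n["priority"], per_estimate[n["id"]]))["id"]

def schedule_superframe(nodes, per_estimates_per_slot):
    """Repeat the slot assignment for every slot of a superframe."""
    return [assign_slot(nodes, per_slot) for per_slot in per_estimates_per_slot]
```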
Results
The proposed scheduling scheme has a significant gain over a reference scheme (i.e., without dynamic adaptation). On average, 20 to 55 percent more packets are received, along with 1 to 5 joules of energy savings, though at the cost of a higher delay ranging from 20 to 200 ms while operating at low power levels (i.e., 0 dBm, -5 dBm, -10 dBm). It is recommended that future wearable IEEE 802.15.6 compliant transceivers can successfully operate at -5 dBm to -8 dBm of transmission power; further reducing the power levels in a dynamic environment can degrade the performance. It is also observed that the achievable throughput of the different time-varying links is good under realistic conditions as long as the data packet generation interval is not shorter than 100 ms.
Acknowledgment: The work was supported by NPRP grant #[6-1508-2-616] from the Qatar National Research Fund which is a member of Qatar Foundation. The statements made herein are solely the responsibility of the authors.
References
[1] “Facts and statistics on Wearable Technology,” 2015. [Online]. Available: http://www.statista.com/topics/1556/wearable-technology/.
[2] M. M. Alam and E. B. Hamida, “Surveying Wearable Human Assistive Technology for Life and Safety Critical Applications: Standards, Challenges and Opportunities,” MDPI Journal on Sensors, vol. 14, no. 5, pp. 9153–9209, 2014.
[3] “802.15.6-2012 - IEEE Standard for Local and metropolitan area networks - Part 15.6: Wireless Body Area Networks,” 2012. [Online]. Available: https://standards.ieee.org/findstds/standard/802.15.6-2012.html.
[4] M. M. Alam and E. B. Hamida, “Strategies for Optimal MAC Parameters Tuning in IEEE 802.15.6 Wearable Wireless Sensor Networks,” Journal of Medical Systems, vol. 39, no. 9, pp. 1–16, 2015.
[5] M. Alam and E. BenHamida, “Performance evaluation of IEEE 802.15.6 MAC for WBSN using a space-time dependent radio link model,” in IEEE 11th AICCSA Conference, Doha, 2014.
Real-Time Location Extraction for Social-Media Events in Qatar
1. Introduction
Social media gives us instant access to a continuous stream of information generated by users around the world. This enables real-time monitoring of users' behavior (Abbar et al., 2015), events' life-cycles (Weng and Lee, 2010), and large-scale analysis of human interactions in general. Social media platforms are also used to propagate influence, spread content, and share information about events happening in real time. Detecting the location of events directly from user-generated text can be useful in different contexts, such as humanitarian response, detecting the spread of diseases, or monitoring traffic. In this abstract, we describe a system that can be used for any of the purposes described above, and illustrate its usefulness with an application for locating traffic-related events (e.g., traffic jams) in Doha.
The goal of this project is to design a system that, given a social-media post describing an event, predicts whether or not the event belongs to a specific category (e.g., traffic accidents) within a specific location (e.g., Doha). If the post is found to belong to the target category, the system proceeds with the detection of all possible mentions of locations (e.g., “Corniche”, “Sports R/A”, “Al Luqta Street”, etc.), landmarks (“City Center”, “New Al-Rayyan gas station”, etc.), and location expressions (e.g., “On the Corniche between the MIA park and the Souq”). Finally, the system geo-localizes (i.e., assigns latitude and longitude coordinates to) every location expression used in the description of the event. This makes it possible to place the different events onto a map; a downstream application will use these coordinates to monitor real-time traffic and geo-localize traffic-related incidents.
2. System Architecture
In this section we present an overview of our system. We first describe its general “modular” architecture, and then proceed with the description of each module.
2.1. General view
The general view of the system is depicted in Figure 1. The journey starts by listening to social media platforms (e.g., Twitter, Instagram) to catch relevant social posts (e.g., tweets, check-ins) using a list of handcrafted keywords related to the context of the system (e.g., road traffic). Relevant posts are then pushed through a three-step pipeline in which we double-check the relevance of the post using an advanced binary classifier (the Content Filter), then extract the location names mentioned in the posts, if any, and finally geo-locate the identified locations to their accurate placement on the map. This process allows filtering out undesirable posts and augmenting the relevant ones with precise geo-location coordinates, which are finally exposed for consumption via a RESTful API. We provide details on each of these modules below.
Figure 1: Data processing pipeline.
2.2. Content filter
The Content Filter consists of a binary classifier that, given a tweet deemed to be about Doha, decides whether the tweet is a real-time report about traffic in Doha or not. The classifier receives as input tweets that have been tweeted from a location enclosed in a geographic rectangle (or bounding box) that roughly corresponds to Doha, and that contain one or more keywords expected to refer to traffic-related events (e.g., “accident”, “traffic”, “jam”, etc.). The classifier is expected to filter out those tweets that are not real-time reports about traffic (e.g., tweets that mention “jam”’ as a type of food, tweets that complain about the traffic in general, etc.). We build the classifier using supervised learning technology; in other words, a generic learning process learns, from a set of tweets that have been manually marked as being either real-time reports about traffic or not, the characteristics that a new tweet should have in order to be considered a real-time report about traffic. For our project, 1000 tweets have been manually marked for training purposes. When deciding about a new tweet, the classifier looks for “cues” that, in the training phase, have been found to be “discriminative”, i.e., helpful in taking the classification decision. In our project, we used the Stanford Maximum Entropy Classifier (Manning and Klein, 2003) to perform the discriminative training. In order to generate candidate cues, the tweet is preprocessed via a pipeline of natural language analysis tools, including a social-media-specific tokenizer (O'Connor et al., 2010) which splits words, and a rule-based Named-Entity Simplifier which substitutes mentions of local entities by their corresponding meta-categories (for example, it substitutes “@moi_qatar” or “@ashghal” for “government_entity”).
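The abstract uses the Stanford Maximum Entropy Classifier for this step; as a hedged illustration of the same idea, the sketch below trains a maximum-entropy (logistic regression) model with scikit-learn on manually labeled tweets. The toy training examples and the label encoding (1 = real-time traffic report) are assumptions, and scikit-learn stands in for the Stanford toolkit.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: tweets labeled 1 (real-time traffic report) or 0 (other).
train_texts = ["Accident on the Corniche near MIA, heavy jam",
               "This strawberry jam recipe is amazing"]
train_labels = [1, 0]

# Word n-gram cues feed a maximum-entropy (logistic regression) classifier,
# standing in for the Stanford MaxEnt classifier used in the project.
content_filter = make_pipeline(
    CountVectorizer(ngram_range=(1, 2), lowercase=True),
    LogisticRegression(max_iter=1000),
)
content_filter.fit(train_texts, train_labels)

print(content_filter.predict(["Huge traffic jam at Sports R/A right now"]))
```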
2.3. NLP components
The Location Expression Extractor is a module that identifies (or extracts) location expressions, i.e., natural language expressions that denote locations (e.g., “@ the Slope roundabout”, “right in front of the Lulu Hypermarket”, “on Khalifa”, “at the crossroads of Khalifa and Majlis Al Taawon”, etc.). A location expression can be a complex linguistic object, e.g., “on the Corniche between the MIA and the underpass to the airport”. A key component of the Location Expression Extractor is the Location Named Entity Extractor, i.e., a module that identifies named entities of Location type (e.g. “the Slope roundabout”) or Landmark type (e.g., “the MIA”). For our purposes, a location is any proper name in the Doha street system (e.g., “Corniche”, “TV roundabout”, “Khalifa”, “Khalifa Street”); landmarks are different from locations, since the locations are only functional to the Doha street system, while landmarks have a different purpose (e.g., the MIA is primarily a museum, although its whereabouts may be used as a proxy of a specific location in the Doha street system – i.e., the portion of the Corniche that is right in front of it).
The Location Named Entity Extractor receives as input the set of tweets that have been deemed to be about some traffic-related event in Doha, and returns the same tweet where named entities of type Location or of type Landmark have been marked as such. We generate a Location Named Entity Extractor via (again) supervised learning technology. In our system, we used the Stanford CRF-based Named Entity Recognizer (Finkel et al., 2005) to recognize named entities of type Location or of type Landmark using a set of tweets where such named entities have been manually marked. From these “training” tweets the learning system automatically recognizes the characteristics that a natural language expression should have in order to be considered a named entity of type Location or of type Landmark. Again, the learning system looks for “discriminative” cues, i.e., features in the text that may indicate the presence of one of the sought named entities. To improve the accuracy over tweets, we used a tweet-specific tokenizer (O'Connor et al., 2010), a tweet-specific Part-of-Speech tagger (Owoputi et al., 2013) and an in-house gazetteer of locations related to Qatar.
2.4. Resolving location expressions onto the map
Once location entities are extracted using the NLP components, we use the APIs of Google, Bing and Nominatim to resolve the location entities into geographic coordinates. Each location entity is geo-coded individually by the Google Geolocation API, the Bing Maps REST API and the Nominatim gazetteer. We use multiple geo-coding sources to increase the robustness of our application, as a single API might fail to retrieve geo-coding data. Given a location entity, the result of the geo-coding retrieval is formatted as a JSON object containing the name of the location entity, its address, and the corresponding geo-coding results from Bing, Google or Nominatim. The geo-coding process is validated by comparing the results of the different services used. We first make sure that the returned location falls within Qatar's bounding box. We then compute the pairwise distance between the different geographic coordinates to ensure their consistency.
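A sketch of this validation step, assuming approximate bounding-box coordinates for Qatar and an arbitrary agreement threshold: each geocoder result is checked against the bounding box and the pairwise haversine distances between the services' answers are compared.

```python
from itertools import combinations
from math import radians, sin, cos, asin, sqrt

# Approximate bounding box of Qatar (lat/lon, illustrative values).
QATAR_BBOX = (24.4, 26.2, 50.7, 51.7)   # (lat_min, lat_max, lon_min, lon_max)

def in_qatar(lat, lon):
    lat_min, lat_max, lon_min, lon_max = QATAR_BBOX
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max

def haversine_km(p, q):
    lat1, lon1, lat2, lon2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def validate(geocodes, max_spread_km=2.0):
    """geocodes: dict service name -> (lat, lon) returned for one location entity."""
    inside = {s: p for s, p in geocodes.items() if in_qatar(*p)}
    if len(inside) < 2:
        return inside  # nothing to cross-check
    spread = max(haversine_km(p, q) for p, q in combinations(inside.values(), 2))
    return inside if spread <= max_spread_km else {}   # inconsistent results are discarded

results = {"google": (25.317, 51.531), "bing": (25.318, 51.530), "nominatim": (25.320, 51.529)}
print(validate(results))
```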
2.5. Description of the RESTful API
In order to ease the consumption of the relevant geo-located posts and make it possible to integrate these posts in a comprehensive way with other platforms, we have built a RESTful API. In the context of our system, this refers to using HTTP verbs (GET, POST, PUT) to retrieve relevant social posts stored by our back-end processing.
Our API exposes two endpoints: Recent and Search. The former provides an interface to request the latest posts identified by our system. It supports two parameters: Count (maximum number of posts to return) and Language (the language of the posts to return, i.e., English or Arabic). The latter endpoint enables querying the posts for specific keywords and returns only posts matching them. This endpoint supports three parameters: Query (list of keywords), Since (date-time of the oldest post to retrieve), and From-To (two date-time parameters expressing the time interval of interest). In the case of a road-traffic application, one could request tweets about “accidents” that occurred in West Bay since the 10th of October.
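A minimal Flask sketch of the two endpoints is shown below. Flask, the in-memory post store, and the lowercase query-string parameter names are assumptions for illustration; only the endpoint semantics follow the description above.

```python
from datetime import datetime
from flask import Flask, jsonify, request

app = Flask(__name__)
POSTS = []  # in-memory stand-in for the back-end store of geo-located posts

@app.route("/recent", methods=["GET"])
def recent():
    # Recent endpoint: latest posts, filtered by language, limited by count.
    count = int(request.args.get("count", 20))
    language = request.args.get("language", "en")
    posts = [p for p in POSTS if p["lang"] == language]
    return jsonify(sorted(posts, key=lambda p: p["time"], reverse=True)[:count])

@app.route("/search", methods=["GET"])
def search():
    # Search endpoint: keyword query plus an optional time constraint
    # (a From-To interval could be handled the same way as 'since').
    keywords = request.args.get("query", "").split(",")
    since = request.args.get("since")          # e.g. "2015-10-10T00:00:00"
    posts = [p for p in POSTS if any(k and k in p["text"] for k in keywords)]
    if since:
        cutoff = datetime.fromisoformat(since)
        posts = [p for p in posts if datetime.fromisoformat(p["time"]) >= cutoff]
    return jsonify(posts)

# Example request: GET /search?query=accident&since=2015-10-10T00:00:00
```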
3. Target: single architecture for multiple applications
Our proposed platform is highly modular (see Figure 1). This guarantees that relatively simple changes in some modules can make the platform relevant to any applicative context where locating user messages on a map is required. For instance, the content classifier – the first filtering element in the pipeline – can be oriented to mobility problems in a city: accident or congestion reporting, road blocking, construction sites, etc. With the suitable classifier, our platform will collect traffic and mobility tweets and geo-locate them when possible. However, there are many other contexts in which precise location is needed. For instance, in natural disaster management, it is well established that people involved in catastrophic events (floods, typhoons, etc.) use social media as a means to create awareness and to demand help or medical attention (Imran et al., 2013). Quite often, these messages may contain critical information for relief forces, who may not have enough knowledge of the affected place and/or accurate information on the level of damage to buildings or roads. Often, the task of reading, locating on a map and marking is crowd-sourced to volunteers; we foresee that, in such time-constrained situations, our proposed technology would represent an advancement. Likewise, the system may be oriented towards other applications: weather conditions, leisure, etc.
4. System Instantiation
We have instantiated the proposed platform to the problem of road traffic in Doha. Our objective is to sense in real time the traffic status in the city using social media posts only. Figure 2 shows three widgets of the implemented system. First, the Geo-mapped Tweets Widget shows a map of Doha with different markers: the yellow markers symbolize tweets geo-located by the users, while the red markers represent tweets geo-located by our system; large markers come from tweets that have an attached photo, while small markers represent text-only tweets. Second, the Popular Hashtags Widget illustrates the hashtags mentioned by users, where a larger font size indicates a more frequent hashtag. Third, the Tweets Widget lists the traffic-related tweets collected by our system.
Figure 2: Snapshot of some of the system's frontend widgets.
5. References
-
-
-
Sentiment Analysis in Comments Associated to News Articles: Application to Al Jazeera Comments
Authors: Khalid Al-Kubaisi, Abdelaali Hassaine and Ali Jaoua

Sentiment analysis is a very important research task that aims at understanding the general sentiment of a specific community or group of people. Sentiment analysis of Arabic content is still in its early development stages. In the scope of Islamic content mining, sentiment analysis helps in understanding what topics Muslims around the world are discussing, which topics are trending and also which topics will be trending in the future.
This study has been conducted on a dataset of 5000 comments on news articles collected from the Al Jazeera Arabic website. All articles were about the recent war against the Islamic State. The database has been annotated using CrowdFlower, a website for crowdsourcing the annotation of datasets. Users manually selected whether the sentiment associated with a comment was positive, negative or neutral. Each comment has been annotated by four different users and each annotation is associated with a confidence level between 0 and 1. The confidence level reflects whether the users who annotated the same comment agreed or not (1 corresponds to full agreement between the four annotators and 0 to full disagreement).
Our method represents the corpus by a binary relation between the set of comments (x) and the set of words (y). A relation exists between the comment (x) and the word (y) if, and only if, (x) contains (y). Three binary relations are created for comments associated with positive, negative and neutral sentiments. Our method then extracts keywords from the obtained binary relations using the hyper concept method [1]. This method decomposes the original relation into non-overlapping rectangles and highlights for each rectangle the most representative keyword. The output is a list of keywords sorted in a hierarchical ordering of importance. The obtained keyword list associated with positive, negative and neutral comments are fed into a random forest classifier of 1000 random trees in order to predict the sentiment associated with each comment of the test set.
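A minimal sketch of the classification stage is given below: comments are turned into a binary comment-by-keyword relation and fed to a 1000-tree random forest. The hyper concept keyword extraction of [1] is approximated here by a fixed keyword list, and the comments and labels are toy placeholders.

```python
# Sketch of the classification stage: a binary comment-by-keyword relation
# feeds a 1000-tree random forest. The keyword list stands in for the output
# of the hyper concept method [1]; comments and labels are toy examples.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

keywords = ["war", "peace", "victory", "loss"]          # stand-in for [1]'s output
comments = ["the war brings only loss", "a step towards peace"]
labels   = ["negative", "positive"]                     # crowd-sourced annotations

# Binary relation: entry (x, y) = 1 iff comment x contains keyword y.
vectorizer = CountVectorizer(vocabulary=keywords, binary=True)
X = vectorizer.fit_transform(comments)

clf = RandomForestClassifier(n_estimators=1000, random_state=0)
clf.fit(X, labels)
print(clf.predict(vectorizer.transform(["war means loss"])))
```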
Experiments have been conducted after splitting the database into 70% training and 30% testing subsets. Our method achieves a correct classification rate of 71% when considering annotations with all confidence values, and of 89% when only considering annotations with a confidence value equal to 1. These results are very promising and attest to the relevance of the extracted keywords.
In conclusion, the hyper concept method extracts discriminative keywords which are used to successfully distinguish between comments expressing positive, negative and neutral sentiments. Future work includes performing further experiments using a varying threshold on the confidence value. Moreover, by applying a part-of-speech tagger, it is planned to perform keyword extraction on words with specific grammatical roles (adjectives, verbs, nouns, etc.). Finally, it is also planned to test this method on publicly available datasets such as the Rotten Tomatoes Movie Reviews dataset [2].
Acknowledgment
This contribution was made possible by NPRP grant #06-1220-1-233 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.
References
[1] A. Hassaine, S. Mecheter, and A. Jaoua. “Text Categorization Using Hyper Rectangular Keyword Extraction: Application to News Articles Classification.” Relational and Algebraic Methods in Computer Science. Springer International Publishing, 2015. 312–325.
[2] B. Pang and L. Lee. 2005. “Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales”. In ACL, pages 115–124.
-
-
-
Flight Scheduling in the Airspace
Authors: Mohamed Kais Msakni, Mohamed Kharbeche, Mohammed Al-Salem and Abdelmagid Hammuda

This paper addresses an important problem in aircraft traffic management caused by the rapid growth of air traffic. The air route traffic control center has to deal with the different plans of airlines, in which they specify a requested entry time of their aircraft into the airspace. Each flight has to be assigned to a track and a level in order to ensure the Federal Aviation Administration (FAA) safety standards. When two flights are assigned to the same track and level, a minimum separation time has to be ensured. If this condition cannot be satisfied, one of the flights will be delayed. This solution is undesirable for many reasons, such as missed connecting flights, a decrease in passengers' satisfaction, etc.
The problem of track-level scheduling can be defined as follows. Given a set of flights, each flight has to be assigned to one track and one level. To ensure the separation time between two flights assigned to the same track and level, it is possible to delay the requested departure time of a flight. The objective is to minimize the overall flight delay.
To deal with this problem, we propose a mixed integer programming formulation to find a flight plan that minimizes the objective function while ensuring the FAA safety standards. In particular, this model considers an aircraft-dependent separation time: the separation time depends on the types of the aircraft assigned to the same track and level. However, some problems are too large to be solved in a reasonable time with the proposed model using a commercial solver. In this study, we developed a scatter search (SS) to deal with larger instances. SS is an evolutionary heuristic whose main feature is its problem-independent structure. This metaheuristic has been efficiently applied to a variety of optimization problems. Initially, SS starts with a set of solutions (the reference set) that is constantly updated through two procedures (solution generation and combination) with the aim of producing high-quality solutions.
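The sketch below gives a simplified flavour of such a formulation using the open-source PuLP modeller rather than CPLEX. It keeps the main ingredients (assignment variables, delay variables, a big-M separation constraint) but fixes the processing order of each flight pair to the order of requested entry times, which the full model does not assume; all data are illustrative.

```python
# Simplified sketch of track/level assignment with delay minimisation.
# Not the paper's exact model: each pair is sequenced by requested time.
from itertools import combinations
from pulp import LpProblem, LpMinimize, LpVariable, LpBinary, lpSum, value

flights = {"F1": 0, "F2": 3, "F3": 5}        # requested entry times (minutes)
sep     = 10                                 # separation (aircraft-dependent in the paper)
slots   = [("T1", "L1"), ("T1", "L2"), ("T2", "L1")]   # (track, level) pairs
BIG_M   = 10_000

prob = LpProblem("track_level_scheduling", LpMinimize)
x = {(f, t, l): LpVariable(f"x_{f}_{t}_{l}", cat=LpBinary)
     for f in flights for (t, l) in slots}
d = {f: LpVariable(f"delay_{f}", lowBound=0) for f in flights}

prob += lpSum(d.values())                                # minimise total delay
for f in flights:
    prob += lpSum(x[f, t, l] for (t, l) in slots) == 1   # exactly one track and level

# If two flights share a track and level, enforce the separation time
# (the earlier-requested flight is assumed to go first).
for f, g in combinations(sorted(flights, key=flights.get), 2):
    for (t, l) in slots:
        prob += (flights[g] + d[g]) - (flights[f] + d[f]) >= \
                sep - BIG_M * (2 - x[f, t, l] - x[g, t, l])

prob.solve()
print("total delay:", value(prob.objective))
print({f: value(d[f]) for f in flights})
```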
In order to assess the quality of the exact method and the scatter search, we carried out an experimental study on a set of instances generated from real case data. This includes small (80 to 120 flights), medium (200 to 220 flights), and large (400 to 420 flights) instances. The mathematical model has been solved using CPLEX 12.6 and the scatter search has been coded in the C language under the Microsoft Visual Studio v12 environment. The tests were conducted on a Windows 7 machine with an Intel Core i7 and 8 GB of RAM. The model was tested on each instance with a 1-hour time limit. The results show that no instance was solved to optimality. For small instances, the model and the scatter search provide comparable results; however, for medium and large instances, scatter search gives the best results.
This conference was made possible by the UREP award [UREP 13 - 025 - 2 - 010] from the Qatar National Research Fund (a member of The Qatar Foundation).
-
-
-
Named Entity Disambiguation using Hierarchical Text Categorization
Authors: Abdelaali Hassaine, Jameela Al Otaibi and Ali Jaoua

Named entity extraction is an important step in natural language processing. It aims at finding the entities which are present in text, such as organizations, places or persons. Named entity extraction is of paramount importance when it comes to automatic translation, as different named entities are translated differently. Named entities are also very useful for advanced search engines which aim at searching for detailed information regarding a specific entity. Named entity extraction is a difficult problem as it usually requires a disambiguation step, since the same word might belong to different named entities depending on the context.
This work has been conducted on the ANERCorp named entities database. This Arabic database contains four different named entity categories: person, organization, location and miscellaneous. The database contains 6099 sentences, out of which 60% are used for training, 20% for validation and 20% for testing.
Our method for named entity extraction contains two main steps: the first step predicts the list of named entities which are present at the sentence level. The second step predicts the named entity of each word of the sentence.
The prediction of the list of named entities at the sentence level is done by separating the document into sentences using punctuation marks. Subsequently, a binary relation between the set of sentences (x) and the set of words (y) is created from the obtained list of sentences. A relation exists between the sentence (x) and the word (y) if, and only if, (x) contains (y). A binary relation is created for each category of named entities (person, organization, location and miscellaneous). If a sentence contains several named entities, it is duplicated in the relation corresponding to each one of them. Our method then extracts keywords from the obtained binary relations using the hyper concept method [1]. This method decomposes the original relation into non-overlapping rectangles and highlights for each rectangle the most representative keyword. The output is a list of keywords sorted in a hierarchical ordering of importance. The keyword lists associated with each category of named entities are fed into a random forest classifier of 10,000 random trees in order to predict the list of named entities associated with each sentence. The random forest classifier produces, for each sentence, the list of probabilities corresponding to the existence of each category of named entities within the sentence:
RandomForest(sentence_i) = (P(Person), P(Organization), P(Location), P(Miscellaneous)).
Subsequently, the sentence is associated with the named entities for which the corresponding probability is larger than a threshold set empirically on the validation set.
In the second step, we create a lookup table that associates with each word in the database the list of named entities to which it corresponds in the training set.
For unseen sentences of the test set, the list of named entities predicted at the sentence level is produced, and for each word, the list of predicted named entities is also produced using the lookup table previously built. Ultimately, for each word, the intersection between the two predicted lists of named entities (at the sentence and the word level) will give the final predicted named entity. In the case where more than one named entity is produced at this stage, the one with the maximum probability is kept.
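The following sketch illustrates the two-step decision with toy values: sentence-level probabilities are thresholded to obtain candidate entity types, intersected with the word-level lookup table, and the highest-probability candidate is kept.

```python
# Sketch of the two-step disambiguation: threshold the sentence-level
# probabilities, intersect with the word-level lookup table, and keep the
# candidate with the highest probability. Values are toy examples.
THRESHOLD = 0.3   # set empirically on the validation set in the paper

# Step 1 output for one sentence (e.g., from the random forest classifier).
sentence_probs = {"Person": 0.7, "Organization": 0.1,
                  "Location": 0.5, "Miscellaneous": 0.05}
sentence_types = {t for t, p in sentence_probs.items() if p >= THRESHOLD}

# Step 2: lookup table word -> entity types seen in the training set.
lookup = {"Doha": {"Location", "Organization"}, "Ahmed": {"Person"}}

def disambiguate(word):
    candidates = lookup.get(word, set()) & sentence_types
    if not candidates:
        return "O"                       # no named entity predicted
    # keep the candidate with the highest sentence-level probability
    return max(candidates, key=lambda t: sentence_probs[t])

print(disambiguate("Doha"))    # -> 'Location'
print(disambiguate("Ahmed"))   # -> 'Person'
```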
We obtained an accuracy of 76.58% when only considering lookup tables of named entities produced at the word level. When performing the intersection with the list produced at the sentence level the accuracy reaches 77.96%.
In conclusion, hierarchical named entity extraction leads to improved results over direct extraction. Future work includes the use of other linguistic features and a larger lookup table in order to improve the results. Validation on other state-of-the-art databases is also considered.
Acknowledgements
This contribution was made possible by NPRP grant #06-1220-1-233 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.
Reference
[1] A. Hassaine, S. Mecheter, and A. Jaoua. “Text Categorization Using Hyper Rectangular Keyword Extraction: Application to News Articles Classification”. Relational and Algebraic Methods in Computer Science. Springer International Publishing, 2015. 312–325.
-
-
-
SWIPT MIMO Relaying in Spectrum Sharing Networks with Interference Cancellation
Simultaneous wireless information and power transfer (SWIPT) is a promising solution to increase the lifetime of wireless nodes and hence alleviate the energy bottleneck of energy-constrained wireless networks. To date, there are three different designs of SWIPT systems: integrated SWIPT, closed-loop SWIPT, and decoupled SWIPT. Integrated SWIPT is the simplest design, where power and information are extracted by the mobile from the same modulated microwave transmitted by a base station (BS). For this scheme, the information transfer (IT) and power transfer (PT) distances are equal. The closed-loop scenario splits IT and PT between uplink and downlink, wherein PT is in the downlink and IT is dedicated to the uplink. The last design adds an additional special base station called a power beacon (PB), in which PT and IT are orthogonalized by using different frequency bands or time slots to avoid interference. Therefore, powering a cognitive radio network through RF energy harvesting can be efficient in terms of spectrum usage and energy limits for wireless networking. The RF energy harvesting technique is also applicable in cooperative networks, wherein an energy-constrained relay with a limited battery depends on an external charging mechanism to assist the transmission of source information to the destination. In an effort to further improve spectrum sharing network performance, a number of works have suggested the idea of incorporating multiple antenna techniques into cognitive relaying. In particular, transmit antenna selection with receive maximal ratio combining (TAS/MRC) is adopted as a low-complexity and power-efficient approach which achieves full transmit/receive diversity.
Since the SUs and PUs share the same frequency band, there will inevitably be interference between the SUs and PUs. Therefore, reducing the effect of PU interference on the performance of the secondary receiver is of significant importance. Consequently, smart antennas can be employed to mitigate the PU interference. With knowledge of the direction of arrival (DoA), the receive radiation pattern can be shaped to place deep nulls in the directions of some of the interfering signals. To this end, two null-steering algorithms were proposed in the literature, i.e., the dominant interference reduction algorithm and the adaptive arbitrary interference reduction algorithm. The first algorithm requires perfect prediction and statistical ordering of the interference signals' instantaneous power, while the latter algorithm does not need prior knowledge of the statistical properties of the interfering signals. In this work, we limit our analysis to the dominant interference reduction algorithm.
In this work, we consider dual-hop relaying with an amplify-and-forward (AF) scheme where the source, the relay, and the destination are equipped with multiple antennas. The relay node experiences co-channel interference (CCI). The purpose of array processing at the relay is to provide interference cancellation. Therefore, the energy-constrained relay collects energy from ambient RF signals, cancels the CCI, and then forwards the information to the destination. In particular, we provide a comprehensive analysis of the system assuming antenna selection at the source and the destination. We derive the exact and asymptotic end-to-end outage probability for the proposed system model. Key parameters featuring the diversity and coding gains are also obtained.
-
-
-
Action Recognition in Spectator Crowds
Authors: Arif Mahmood and Nasir Rajpoot
During the Football Association competitions held in the UK in 2013, 2,273 people were arrested due to events of lawlessness and disorder, according to statistics collected by the UK Home Office [1]. According to a survey of major soccer stadium disasters around the world, more than 1,500 people died and more than 5,000 were injured between 1902 and 2012 [2]. Therefore, understanding spectator crowd behaviour is an important problem for public safety management and for the prevention of dangerous activities.
Computer vision is the platform used by researchers for efficient crowd management research through video cameras. However, most research efforts primarily show results on protest crowds or casual crowds, while spectator crowds have received little attention. On the other hand, action recognition research has mostly addressed actions performed by one or two actors, while actions performed by individuals in dense spectator crowds have not been addressed and remain an unsolved problem.
Action recognition in dense crowds poses very difficult challenges, mostly due to the low resolution of subjects and significant variations in how the same individual performs an action. Different individuals also perform the same action quite differently. The spatial distribution of performers varies with time, and a scene contains multiple actions at the same time. Thus, compared to single-actor action recognition, noise and outliers are significantly larger and action start and stop points are not well defined, making action recognition very difficult.
In this work we aim to recognize the actions performed by individuals in spectator crowds. For this purpose we consider a recently released dataset consisting of spectators at the 26th Winter Universiade held in Italy in 2013 [3]. Data was collected during the last four matches held in the same ice stadium using 5 cameras. Three high-resolution cameras focused on different parts of the spectator crowd with 1280 × 1024 pixel resolution and 30 fps temporal resolution. Figure 1 shows an example image from the spectator crowd dataset.
For action recognition in spectator crowds, we propose to compute dense trajectories in the crowd videos using optical flow [4]. Trajectories are initiated on a dense grid and the starting points must satisfy a quality measure based on the KLT feature tracker (Fig. 2). Trajectories exhibiting motion lower than a minimum threshold are discarded. Along each trajectory, shape and texture are encoded using Histograms of Oriented Gradients (HOG) features [5] and motion is encoded using Histogram of Flow (HOF) features [6]. The resulting feature vectors are grouped using the person bounding boxes provided in the dataset (Fig. 4). Note that person detectors which are especially designed for the detection and segmentation of persons in dense crowds can also be used for this purpose [7].
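A rough sketch of the trajectory-extraction step is shown below, in the spirit of [4]: seed points are kept where the KLT quality measure is high, tracked with pyramidal Lucas-Kanade optical flow in OpenCV, and near-static trajectories are discarded. The parameters, file name and the use of goodFeaturesToTrack as the quality filter are illustrative choices, and the HOG/HOF descriptor computation is not shown.

```python
# Illustrative trajectory extraction: seed with the Shi-Tomasi (KLT) quality
# measure, track with pyramidal Lucas-Kanade flow, drop near-static tracks.
import cv2
import numpy as np

cap = cv2.VideoCapture("crowd_video.mp4")        # placeholder file name
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

seeds = cv2.goodFeaturesToTrack(prev_gray, maxCorners=2000,
                                qualityLevel=0.001, minDistance=5)
tracks = [[pt.ravel()] for pt in seeds]

for _ in range(15):                              # track over L = 15 frames
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    pts = np.float32([t[-1] for t in tracks]).reshape(-1, 1, 2)
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    for t, p, s in zip(tracks, new_pts, status):
        if s:                                    # point successfully tracked
            t.append(p.ravel())
    prev_gray = gray

MIN_MOTION = 3.0                                 # pixels; discard static tracks
trajectories = [np.array(t) for t in tracks
                if len(t) > 1
                and np.linalg.norm(np.array(t)[-1] - np.array(t)[0]) > MIN_MOTION]
print(len(trajectories), "trajectories kept")
```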
All trajectories corresponding to a particular person are considered to encode the actions performed by that person. These trajectories are divided into overlapping temporal windows of width 30 frames (or 1.00 second). Two consecutive windows have an overlap of 66%. Each person-time window is encoded using the bag-of-words technique as explained below.
The S-HOCK dataset contains 15 videos of spectator crowds. For the purpose of training we use 10 videos and the remaining 5 videos are used for testing. From the training videos, 100,000 trajectories are randomly sampled and grouped into 64 clusters using the k-means algorithm. Each cluster center is considered an item in the codebook. Each trajectory in a person-time group of trajectories is encoded using this codebook. This encoding is performed in the training as well as the test videos using the bag-of-words approach. The codebook is considered part of the training process and saved.
For the purpose of bag-of-words encoding, the distance of each trajectory in the person-time trajectory group is measured from all items in the codebook. Here we follow two approaches. In the first approach, only one vote is cast at the index corresponding to the best matching codebook item. In the second approach, 5 votes are cast corresponding to the 5 best matching codebook items. These votes are given weights inversely proportional to the distance of the trajectory from each of the five best matching codebook items.
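The sketch below illustrates the codebook construction and the multi-voting encoding with random stand-in descriptors: a 64-word codebook is learned with k-means, and each trajectory votes for its 5 nearest codewords with weights inversely proportional to the distances.

```python
# Sketch of codebook learning and multi-voting bag-of-words encoding.
# Trajectory descriptors are random stand-ins for the real HOG/HOF features.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
train_feats = rng.random((10_000, 96))           # subsampled stand-in descriptors

codebook = KMeans(n_clusters=64, n_init=10, random_state=0).fit(train_feats)

def encode_group(group_feats, k=5, eps=1e-6):
    """Histogram over the codebook for one person-time trajectory group."""
    hist = np.zeros(codebook.n_clusters)
    dists = codebook.transform(group_feats)      # distances to all 64 centers
    for row in dists:
        nearest = np.argsort(row)[:k]
        weights = 1.0 / (row[nearest] + eps)
        hist[nearest] += weights / weights.sum() # 5 weighted votes per trajectory
    return hist / max(len(group_feats), 1)

person_window = rng.random((40, 96))             # trajectories of one person-window
print(encode_group(person_window).shape)         # (64,)
```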
In our experiments we observe better action recognition performance with the multi-voting strategy compared to the single-vote scheme, because more information is captured by the multi-voting strategy. In the S-HOCK dataset, each person is manually labelled as performing one of 23 actions, including the ‘other’ action which covers all actions not included in the first 22 categories (Fig. 3). Each person-time group of trajectories is given an action label from the dataset. Once this group is encoded using the codebook, it becomes a single histogram vector. Each of these vectors is given the same action label as the corresponding person-time trajectory group.
The labelled vectors obtained from the training dataset are used to train both linear and kernel SVMs using a one-versus-all strategy. The labels of the vectors in the test data are used as ground truth and the learned SVMs are used to predict the label of each test vector independently. The predicted labels are then compared with the ground truth labels to establish action recognition accuracy. We observe an accuracy increase of 3% to 4% when an SVM with a Gaussian RBF kernel is used. Results are shown in Table 1 and precision-recall curves are shown in Figs. 5 & 6.
In our experiments we observe that the applauding and flag-shaking actions obtain higher accuracy than the other actions in the dataset (Table 1). This is mainly because these actions occur frequently and contain significant discriminative motion, while other actions occur less often and, in some cases, their motion is not discriminative. For example, for the ‘using device’ action, when someone in the crowd uses a mobile phone or a camera, motion-based detection is not very effective.
References
[1]Home Office and The Rt Hon Mike Penning MP, “Football-related arrests and banning orders, season 2013 to 2014”, published 11 September 2014.
[2]Associated Press, “Major Soccer Stadium Disasters”, The Wall Street Journal (World), published 1 February 2012.
[3]Conigliaro, Davide, et al. “The SHock Dataset: Analyzing Crowds at the Stadium.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
[4]Wang, Heng, et al. “Action recognition by dense trajectories.” Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011.
[5]Dalal, Navneet, and Bill Triggs. “Histograms of oriented gradients for human detection.” Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on. Vol. 1. IEEE, 2005.
[6]Dalal, Navneet, Bill Triggs, and Cordelia Schmid. “Human detection using oriented histograms of flow and appearance.” Computer Vision–ECCV 2006. Springer Berlin Heidelberg, 2006. 428–441.
[7]Idrees, Haroon, Khurram Soomro, and Mubarak Shah. “Detecting Humans in Dense Crowds using Locally-Consistent Scale Prior and Global Occlusion Reasoning.” IEEE TPAMI 2015.
-
-
-
Plasmonic Modulator Based on Fano Resonance
The field of plasmonics continues to attract research in the area of integrated photonics development, with the goal of highly integrating photonic components, devices and detectors in a single photonic chip, just as electronic chips contain many electronic components. The interesting properties that plasmonics offers include electromagnetic field enhancement and the confinement of propagating surface plasmon polaritons to sub-100 nm features at metal-dielectric interfaces. Thereby, the field of plasmonics is very promising for miniaturizing photonic components to sizes that cannot be achieved using conventional optics, and in particular for the silicon photonics industry. Many applications based on plasmonics are being increasingly developed and studied, such as electromagnetic field enhancement for surface-enhanced spectroscopy, wave guiding, sensing, modulation, switching, and photovoltaic applications.
We hereby propose a novel compact plasmonic resonator that can be utilized for different applications that depend on optical resonance phenomena in the near-infrared spectral range, a very interesting range for a variety of applications, including sensing and modulation. The resonator structure consists of a gold layer which is etched to form a metal-insulator-metal waveguide and a rectangular cavity. The rectangular cavity and the waveguide are initially treated as a dielectric material. The strong reflectivity of gold at frequencies below the plasma frequency is the origin of the Fabry-Perot resonator behavior of the rectangular cavity. The Fano resonance was produced successfully and controlled by varying the rectangular cavity dimensions. The Fano profile is generated as a result of the redistribution of the electromagnetic field in the rectangular cavity, as depicted by the plasmonic mode distribution in the resonator. The Fano resonance is characterized by its sharp spectral line, which attracts applications requiring sharp spectral line shapes, such as sensing and modulation.
Optical modulators are key components in the modern communication technology. The research trend on optical modulators aims at achieving compact designs, low power consumption and large bandwidth operation. Plasmonic modulators emerge as promising devices since they can have high modulation speeds and very compact designs.
The operating mechanism of our proposed plasmonic modulator is as follows: instead of a constant-refractive-index dielectric, an electro-optic polymer, whose refractive index depends on a controlled external quantity, fills the metal-insulator-metal waveguide and the rectangular cavity. Efficient modulation is achieved by changing the applied voltage (DC signal) on the metal contacts, which changes the refractive index of the polymer, thereby shifting the resonant wavelength of the resonator and modulating the signal. Our modulator operates at the telecom wavelength of 1.55 μm and is therefore suitable for modern communication technology.
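As a back-of-the-envelope illustration of this mechanism (not the actual device model, which requires full FDTD simulation), a Fabry-Perot-like cavity of length L and effective index n resonates near λ = 2nL/m, so a voltage-induced index change shifts the resonance:

```python
# Toy estimate of the modulation mechanism: a voltage-induced change in the
# electro-optic polymer index n shifts the Fabry-Perot-like resonance
# lambda_m = 2 n L / m. Numbers are illustrative, not device parameters.
L = 500e-9                 # cavity length (m), illustrative
m = 1                      # resonance order
for n in (1.60, 1.61):     # polymer index without / with applied DC bias
    lam = 2 * n * L / m
    print(f"n = {n:.2f}  ->  resonance near {lam * 1e9:.1f} nm")
```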
Finite Difference Time Domain (FDTD) simulations were conducted to design the modulator structure, run the simulation experiments, study the resonance effects of the structure and optimize its response. The most important result is the efficient modulation of the optical energy at the wavelengths required in modern communication technology, around 1.5 μm. All simulations were carried out using the commercially available Lumerical FDTD software.
-
-
-
Conceptual-based Functional Dependency Detection Framework
By Fahad Islam

Nowadays, knowledge discovery from data is one of the challenging problems, due to its importance in different fields such as biology, economics and the social sciences. One way of extracting knowledge from data is to discover functional dependencies (FDs). An FD explores the relation between different attributes, so that the value of one or more attributes is determined by another attribute set [1]. FD discovery helps in many applications, such as query optimization, data normalization, interface restructuring, and data cleaning. A plethora of functional dependency discovery algorithms has been proposed; some of the most widely used are TANE [2], FD_MINE [3], FUN [4], DFD [5], DEP-MINER [6], FASTFDS [7] and FDEP [8]. These algorithms extract FDs using different techniques, such as: (1) building a search space of all attribute combinations in an ordered manner, then searching for candidate attributes that are assumed to have a functional dependency between them; (2) generating agreeing and difference sets, where the agreeing sets are acquired by applying a cross product of all tuples, the difference sets are the complement of the agreeing sets, and both sets are used to infer the dependencies; (3) generating one generic set of functional dependencies, in which each attribute can determine all other attributes, and then updating this set and removing some dependencies to include more specialized dependencies through pairwise record comparisons.
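At its core, every one of these algorithms organizes and prunes the same elementary test, sketched below on a toy table: an FD X → Y holds if and only if every distinct value of X maps to exactly one value of Y.

```python
# Minimal check of a candidate functional dependency X -> Y on a table:
# X determines Y iff every distinct X-value maps to exactly one Y-value.
# The table is a toy example.
import pandas as pd

df = pd.DataFrame({
    "emp_id":  [1, 2, 3, 4],
    "dept":    ["IT", "IT", "HR", "HR"],
    "manager": ["Ali", "Ali", "Sara", "Sara"],
})

def holds(df, lhs, rhs):
    """True iff the functional dependency lhs -> rhs holds in df."""
    return (df.groupby(list(lhs))[rhs].nunique() <= 1).all()

print(holds(df, ["dept"], "manager"))    # True:  dept -> manager
print(holds(df, ["manager"], "emp_id"))  # False: manager does not determine emp_id
```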
Huge efforts have been dedicated to comparing the most widely used algorithms in terms of runtime and memory consumption. No attention has been paid to the accuracy of the resultant set of functional dependencies. Functional dependency accuracy is defined by two main factors: being complete and being minimal.
In this paper, we propose a conceptual-based functional dependency detection framework. The proposed method is mainly based on Formal Concept Analysis (FCA), a mathematical framework rooted in lattice theory and used for conceptual data analysis, where data is represented in the form of a binary relation called a formal context [9]. From this formal context, a set of implications is extracted; these implications have the same form as FDs. Implications are proven to be semantically equivalent to the set of all functional dependencies holding in a given database [10]. This set of implications should be the smallest set representing the formal context, which is termed the Duquenne–Guigues, or canonical, basis of implications [11]. Moreover, completeness of the implications is achieved by applying the Armstrong rules discussed in [12].
The proposed framework is composed of three main components; they are:
Data transformation component: it converts input data to binary formal context.
Reduction component: it applies data reduction on tuples or attributes.
Implication extraction component: this is responsible for producing minimal and complete set of implications.
The key benefits of the proposed framework:
1 It works on any kind of input data (qualitative and quantitative), which is automatically transformed into a formal context of binary relations;
2 A crisp Lukasiewicz data reduction technique is implemented to remove redundant data, which helps reduce the total runtime;
3 The set of implications produced is guaranteed to be minimal, due to the use of the Duquenne–Guigues algorithm in the extraction;
4 The set of implications produced is guaranteed to be complete, due to the use of the Armstrong rules.
The proposed framework is compared to the seven most commonly used algorithms listed above and evaluated based on runtime, memory consumption and accuracy using benchmark datasets.
Acknowledgement
This contribution was made possible by NPRP-07-794-1-145 grant from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.
-
-
-
An Arabic Text-to-Picture Mobile Learning System
Authors: AbdelGhani Karkar, Jihad Al Ja'am and Sebti Foufou

Handheld devices and software applications have the potential to improve learning effectiveness, awareness, and career development. Many mobile-based learning applications are available on the market, but the shortage of Arabic learning content is not taken into consideration. We have developed an Arabic Text-to-Picture (TTP) mobile educational application which performs knowledge extraction and concept analysis to generate pictures that represent the content of an Arabic text. The knowledge extraction is based on Arabic semantic models covering scopes important for young children and new Arabic learners (i.e., grammar, nature, animals). The concept analysis uses semantic reasoning, semantic rules, and an Arabic natural language processing (NLP) tool to identify word-to-word relationships. The retrieval of images is done spontaneously from a local repository and an online search engine (i.e., Google or Bing). The instructor can select the Arabic educational content, get semi-automatically generated pictures, and use them for explanation. Preliminary results show improvement in Arabic learning strength and memorization.
Keywords
Mobile Learning, Natural Language Processing, Ontology, Multimedia Learning, Engineering Education.
I. Introduction
Nowadays, mobile learning environments are used extensively in diverse fields and have become a common part of educational practice. In such an environment, learners are able to reach online educational materials from any location. Learners of the Arabic language suffer from a lack of adequate resources. In fact, most educational software, tools, and websites use classical techniques for introducing concepts and explaining vocabulary. We present in this paper a text-to-picture (TTP) educational mobile system that promotes Arabic children's stories through semi-automatically generated pictures that illustrate their contents in an attractive manner. Preliminary results show that the system enhances the Arabic learners' comprehension, deduction and realization.
II. Background
Natural language processing (NLP) concerns the extraction of useful information from, and the mining of, natural text. This information can be used to identify the scope of the text in order to generate summaries, classify contents and teach vocabulary. Diverse NLP-based systems that illustrate text with images have been developed recently [1, 2, 3]. In general, these systems divide the text into segments and single words, access local multimedia resources, or explore the web to get pictures and images to illustrate the content.
None of the proposed systems and techniques, however, covers the Arabic language. In this paper, we propose an Arabic TTP educational system that uses multimedia technology to teach children in an attractive way. Our proposal generates multimedia tutorials dynamically by using Arabic text processing, entity relationship extraction, a multimedia ontology, and online extraction of multimedia content fetched from the Google search engine.
III. Methodology
In order to develop our system, we first created the general system artwork, set up the end-user graphical user interface, designed the semantic model that stores all semantic information about terms, and collected and analyzed educational stories. We gathered 30 educational stories, annotated their terms, and associated illustrations manually. Illustrations were gathered from the Internet and an educational repository. The semantic model was developed using the “Protégé editor”, a free open-source ontology editor developed by Stanford [4]. The semantic model is composed of many classes that are referred to as concepts.
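As an illustration of how the Protégé-built model can be consumed at run time, the sketch below loads an ontology file and lists its classes with a SPARQL query; the file name is a placeholder and rdflib merely stands in for whichever OWL library the system actually uses.

```python
# Illustrative sketch: load the Protege-exported ontology (RDF/XML) and list
# its classes. The file name is hypothetical; rdflib is a stand-in choice.
from rdflib import Graph

g = Graph()
g.parse("stories_ontology.owl", format="xml")   # hypothetical ontology file

query = """
    SELECT ?cls WHERE { ?cls a <http://www.w3.org/2002/07/owl#Class> . }
"""
for row in g.query(query):
    print(row[0])                               # concepts such as Animal, Place, ...
```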
IV. The proposed system
The proposed system is a client-server application. When the server is launched, it loads its packages and components, loads the defined ontology and the text parser components, and finally opens a connection to listen for users' requests. Upon a successful connection, the user can enter or open existing Arabic stories and process them. On the client side, the processing request and the response for the story are handled in a separate thread, so the user can continue working without interruption. Finally, the server's reply is displayed to the user on the mobile device; it consists of the processed Arabic story, related images, and different questions about an animal.
V. Conclusion
This study presents a complete system that automatically generates illustrations for Arabic stories through text processing, an Arabic ontology, relationship extraction, and illustration generation. The proposed system is a learning technology that can be deployed on mobile devices to teach children in an attractive and non-traditional style. Preliminary results demonstrate that the system improved learners' comprehension and realization.
References
[1] Bui, Duy, Carlos Nakamura, Bruce E Bray, and Qing Zeng-Treitler, “Automated illustration of patients instructions,” in AMIA Annual Symposium Proceedings, vol. 2012, pp. 1158, 2012.
[2] Li, Cheng-Te, Chieh-Jen Huang, and Man-Kwan Shan, “Automatic generation of visual story for fairy tales with digital narrative,” in Web Intelligence, vol. 13, pp. 115–122, 2015.
[3] Ustalov, Dmitry and R Kudryavtsev, “An Ontology Based Approach to Text to Picture Synthesis Systems,” in Proceedings of the 2nd International Workshop on Concept Discovery in Unstructured Data (CDUD 2012), 2012.
[4] Protégé. Ontology Editor Software. Available from: http://protege.stanford.edu, Accessed: September 2015.
-
-
-
Discovering the Truth on the Web Data: One Facet of Data Forensics
Authors: Mouhamadou Lamine Ba, Laure Berti-Equille and Hossam M. Hammady

Data Forensics with Analytics, or DAFNA for short, is an ambitious project initiated by the Data Analytics Research Group at Qatar Computing Research Institute, Hamad Bin Khalifa University. Its main goal is to provide effective algorithms and tools for determining the veracity of structured information originating from multiple sources. The ability to efficiently estimate the veracity of data, along with the reliability level of the information sources, is a challenging problem with many real-world use cases (e.g., data fusion, social data analytics, rumour detection, etc.) in which users rely on a semi-automated data extraction and integration process in order to consume high-quality information for personal or business purposes. DAFNA's vision is to provide a suite of tools for data forensics and to investigate various research topics, such as fact-checking and truth discovery, and their practical applicability. We will present our ongoing development (dafna.qcri.org) on extensively comparing state-of-the-art truth discovery algorithms, releasing a new system and the first REST API for truth discovery, and designing a novel hybrid truth discovery approach using active ensembling. Finally, we will briefly present real-world applications of truth discovery from Web data.
Efficient Truth Discovery. Truth discovery is a hard problem to deal with since there is no a priori knowledge about the veracity of the provided information or the reliability level of the online sources. This raises many questions about a thorough understanding of the state-of-the-art truth discovery algorithms and their applicability for actionable truth discovery. A new truth discovery approach is needed; it should be comprehensible and domain-independent. In addition, it should take advantage of the benefits of existing solutions, while being built on realistic assumptions for easy use in real-world applications. In this context, we propose an approach that deals with open truth discovery challenges and consists of the following contributions: (i) a thorough comparative study of existing truth discovery algorithms; (ii) the design and release of the first online truth discovery system and the first REST API for truth discovery, available at dafna.qcri.org; (iii) a hybrid truth discovery method using active ensembling; and (iv) an application to query answering related to Qatar, where the veracity of information provided by multiple Web sources is estimated.
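For intuition, the sketch below implements a deliberately simple member of this family of algorithms (not DAFNA's actual method): source trustworthiness and value confidence are refined alternately until they stabilize, on a toy set of conflicting claims.

```python
# A deliberately simple truth discovery iteration (illustrative only, not
# DAFNA's method): alternate between value confidence and source trust.
claims = {                                   # toy example: conflicting values
    ("Doha", "2.4M"): {"src_A", "src_B"},
    ("Doha", "1.1M"): {"src_C"},
}
sources = {s for srcs in claims.values() for s in srcs}
trust = {s: 0.8 for s in sources}            # uniform prior trust

for _ in range(10):                          # fixed number of iterations
    # value confidence = normalised sum of the trust of supporting sources
    conf = {cv: sum(trust[s] for s in srcs) for cv, srcs in claims.items()}
    total = sum(conf.values()) or 1.0
    conf = {cv: c / total for cv, c in conf.items()}
    # source trust = average confidence of the values it claims
    for s in sources:
        supported = [conf[cv] for cv, srcs in claims.items() if s in srcs]
        trust[s] = sum(supported) / len(supported)

best = max(conf, key=conf.get)
print("most credible claim:", best, "supported by:", claims[best])
```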
-
-
-
Identifying Virality Attributes of Arabic Language News Articles
Authors: Sejeong Kwon, Sofiane Abbar and Bernard J. Jansen

Our research is focused on expanding the reach and impact of Arabic language news articles by attracting more readers. In pursuit of this research goal, we analyze the attributes that result in certain news articles becoming viral, relative to other news articles that do not become viral or become only mildly viral. Specifically, we focus on Arabic language news articles, as they are subject to unique linguistic, cultural, and social constraints relative to most Western-language news stories. In order to understand virality, we take two approaches, a time-series approach and a linguistic approach, applied to an Arabic-language dataset of more than 1,000 news articles with associated temporal traffic data. For data collection, we select Kasra (“a breaking”; http://kasra.co/), an Arabic-language online news site that targets Arabic speakers worldwide, but particularly in the Middle East and North Africa (MENA) region. We originally gathered more than 3,000 articles, then gathered traffic data for this set of articles, reducing the set to more than 1,000 articles with complete traffic data. We focus first on the temporal attributes in order to identify virality clusters within this set of articles. Then, with topical analysis, we seek to identify linguistic aspects common to the articles within each virality cluster identified by the time-series analysis. Based on the results of the time-series analysis, we cluster articles according to common temporal characteristics of traffic access. Once the articles are clustered by time series, we analyze each cluster for content attributes, topical and linguistic, in order to identify specific attributes that may be causing the virality of articles within each time-series cluster. To compute dissimilarity between time series, we utilize and evaluate the performance of several state-of-the-art time-series dissimilarity-based clustering approaches, such as dynamic time warping, discrete wavelet transformation, and others. To identify the dissimilarity algorithm with the most discriminating power, we conduct a principal component analysis (PCA), a statistical technique used to highlight variations and patterns in a dataset. Based on the findings from our PCA, we select discrete wavelet transformation-based dissimilarity as the best time-series algorithm for our research, because the resulting principal axes explain a larger proportion of the variability (75.43 percent) than the other time-series algorithms we employed. We identify five virality clusters using time series. For topic modeling, we employ Latent Dirichlet Allocation (LDA). LDA is a generative probabilistic model for collections of discrete data, such as text, that explains similarities among groups of observations within a data set. For text modeling, the topic probabilities of LDA provide an explicit representation of a document. For the topical classification analysis, we use Linguistic Inquiry and Word Count (LIWC), a sentiment analysis tool. LIWC is a text processing program based on the occurrence of words in several categories covering writing style and psychological meaning. Prior empirical work shows the value of LIWC linguistic analysis for detecting meaning in various experimental settings, including attention focus, thinking style, and social relationships. In terms of results, surprisingly, the article topic is not predictive of the virality of Arabic language news articles.
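The sketch below illustrates the selected time-series step on random stand-in traffic curves: each series is reduced to its discrete-wavelet approximation coefficients, a pairwise dissimilarity matrix is computed on those coefficients, and the articles are grouped into five clusters; the wavelet, decomposition level and clustering method are illustrative choices.

```python
# Sketch of DWT-based dissimilarity clustering of article traffic curves.
# The traffic data are random stand-ins for the real hourly access counts.
import numpy as np
import pywt
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
traffic = rng.random((1000, 64))                 # 1,000 articles x 64 time bins

def dwt_features(series, wavelet="haar", level=3):
    approx = pywt.wavedec(series, wavelet, level=level)[0]
    return approx                                # keep the coarse approximation

features = np.array([dwt_features(s) for s in traffic])
dists = pdist(features, metric="euclidean")      # DWT-based dissimilarity
tree = linkage(dists, method="average")
clusters = fcluster(tree, t=5, criterion="maxclust")   # e.g., five virality clusters
print(np.bincount(clusters)[1:])                 # articles per cluster
```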
Instead, we find that the linguistic aspects and style of a news article are the most predictive attributes of virality for Arabic news articles. In analyzing the attributes of virality in Arabic language news articles, our research finds that, perhaps counter-intuitively, the topic of the article does not impact virality; the style of the article is the most impactful attribute. Building on these findings, we will leverage aspects of the news articles, together with other factors, to develop tools that assist content creators in reaching their user segment more effectively. Our research results will assist in understanding the virality of Arabic news and ultimately improve the readership and dissemination of Arabic language news articles.
-
-
-
Efforts Towards Automatically Generating Personas in Real-time Using Actual User Data
Authors: Bernard J. Jansen, Jisun An, Haewoon Kwak and Hoyoun Cho

The use of personas is an interactive design technique with considerable potential for product and content development. A persona is a representation of a group or segment of users sharing common behavioral characteristics. Although it represents a segment of users, a persona is generally developed in the form of a detailed narrative about an explicit but fictitious individual that represents the collection of users possessing similar behaviors or characteristics. In order to make the fictitious individual appear as a real person to product developers, the persona narrative usually contains a variety of demographic and behavioral details about socio-economic status, gender, hobbies, family members, friends, and possessions, among many other data. The narrative of a persona normally also addresses the goals, needs, wants, frustrations and other emotional aspects of the fictitious individual that are pertinent to the product being designed. However, personas have typically been viewed as fairly static. In this research, we demonstrate an approach for creating and validating personas in real time, based on automated analysis of actual user data. Our data collection site and research partner is AJ+ (http://ajplus.net/), a news channel from the Al Jazeera Media Network that is natively digital with a presence only on social media platforms and a mobile application. Its media concept is unique in that AJ+ was designed from the ground up to serve news in the medium of the viewer, rather than a teaser in one medium with a redirect to a website. In pursuit of our overall research objective of automatically generating personas in real time, for the research reported in this manuscript we are specifically interested in understanding the AJ+ audience by identifying (1) whom they are reaching (i.e., market segments) and (2) what competitive (i.e., non-AJ+) content is associated with each market segment. Focusing on one aspect of user behavior, we collect 8,065,350 instances of link sharing by 54,892 users of an online news channel, specifically examining the domains these users share. We then cluster users based on the similarity of the domains shared, identifying seven personas based on this behavioral aspect. We conduct term frequency – inverse document frequency (tf-idf) vectorization. We remove outliers with fewer than 5 shares (too unique) and more than 80% of all users' shares (too popular). We use K-means++ clustering (K = 2..10), an advanced version of K-means with improved selection of initial seeds, because K-means++ works effectively for a very sparse matrix (user-link). We use the “elbow” method to choose the optimal number of clusters, which is eight in this case. In order to characterize each cluster, we list the top 100 domains of each cluster and discover that there are large overlaps among clusters. We then remove from each cluster the domains that exist in another cluster in order to identify the relevant, unique, and impactful domains. This de-duplication results in the elimination of one cluster, leaving us with a set of clusters where each cluster is characterized by domains that are shared only by users within that cluster. We note that the K-means++ clustering method can easily be replaced with other clustering methods in various situations.
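The sketch below mirrors this pipeline on toy data: each user is represented by the domains they share, vectorized with tf-idf, and grouped with k-means++ initialization, with inertia values computed over a range of K for the elbow method. The data and the reduced K range are stand-ins for the 54,892 users and K = 2..10 used in the study.

```python
# Sketch of the user clustering step: tf-idf over shared domains, k-means++
# clustering, and inertia values for the elbow method. Toy data stand in for
# the real user-link matrix.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# one "document" per user: the space-separated list of shared domains
user_domains = [
    "aljazeera.com bbc.com bbc.com",
    "lemonde.fr liberation.fr",
    "aljazeera.com bbc.com cnn.com",
    "elpais.com lemonde.fr",
]

X = TfidfVectorizer(token_pattern=r"[^ ]+").fit_transform(user_domains)

inertias = {}
for k in range(2, 5):                        # K = 2..10 in the study; reduced for the toy data
    km = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0).fit(X)
    inertias[k] = km.inertia_
print(inertias)                              # pick K at the "elbow"

personas = KMeans(n_clusters=2, init="k-means++", n_init=10, random_state=0).fit_predict(X)
print(personas)                              # cluster id per user -> persona segment
```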
Demonstrating that these insights can be used to develop personas in real time, the research results provide insights into competitive marketing, topic interests, and preferred system features for the users of the online news medium. Using the description of each shared link, we detect its language. 55.2% (30,294) of the users share links in just one language and 44.8% share links in multiple languages. The most frequently used language is English (31.98%), followed by German (5.69%), Spanish (5.02%), French (4.75%), Italian (3.46%), Indonesian (2.99%), Portuguese (2.94%), Dutch (2.94%), Tagalog (2.71%), and Afrikaans (2.69%). As there were millions of domains shared, we utilize the top one hundred domains for each cluster, resulting in 700 top domains shared by the 54,892 AJ+ users. As mentioned, we de-duplicated the clusters, resulting in the elimination of one cluster (11,011 users, 20.06%). This leaves seven unique clusters based on the sharing of domains, representing 43,881 users. We then demonstrate how these findings can be leveraged to generate real-time personas based on actual user data. We stream the data analysis results into a relational database and combine the results with other demographic data gleaned from available sources such as Facebook and other social media accounts, using each of the seven clusters as representative of a persona. We give each persona a fictional name and use a stock photo as the face of the persona. Each persona is linked to the top alternative (i.e., non-AJ+) domains its users most commonly share, with the persona's shared links updatable with new data. The research implication is that personas can be generated in real time, instead of being the result of a laborious, time-consuming development process.
-
-
-
Creating Instructional Materials with Sign Language Graphics Through Technology
Authors: Abdelhadi Soudi and Corinne Vinopol

The education of deaf children in the developing world is in a dire state, and there is a dearth of sign language interpreters to assist sign-language-dependent students with translation in the classroom. Illiteracy within the deaf population is rampant. Over the past several years, a unique team of Moroccan and American deaf and hearing researchers has united to enhance the literacy of deaf students by creating tools that incorporate Moroccan Sign Language (MSL) and American Sign Language (ASL), under funding grants from USAID and the National Science Foundation (NSF). MSL is a gestural language distinct from both the spoken languages and the written language of Morocco and has no text representation. Accordingly, translation is quite challenging and requires the representation of MSL in graphics and video.
Many deaf and hard of hearing people do not have good facility with their native spoken language because they have no physiological access to it. Because oral languages depend, to a great extent, upon phonology, reading achievement of deaf children usually falls far short of that of hearing children of comparable abilities. And, by extension, reading instructional techniques that rely on phonological awareness, letter/sound relationships, and decoding, all skills proven essential for reading achievement, have no sensory relevance. Even in the USA, where statistics are available and education of the deaf is well advanced, on average, deaf high school graduates have a fourth grade reading level; only 7–10% of deaf students read beyond a seventh to eighth grade reading level; and approximately 20% of deaf students leave school with a second grade or below reading level (Gallaudet University's national achievement testing programs (1974, 1983, 1990, and 1996); Durnford, 2001; Braden, 1992; King & Quigley, 1985; Luckner, Sebald, Cooney, Young III, & Muir, 2006; Strong, & Prinz, 1997).
Because of spoken language inaccessibility, many deaf people rely on a sign language. Sign language is a visual/gestural language that is distinct from spoken Moroccan Arabic and Modern Standard/written Arabic and has no text representation. It can only be depicted via graphics, video, and animation.
In this presentation, we present Clip and Create, an innovative technology for the automatic creation of sign-language-supported instructional material. The technology has two tools – Custom Publishing and Instructional Activities Templates – and the following capabilities:
(1)Automatically constructs customizable publishing formats;
(2)Allows users to import Sign Language clip art and other graphics;
(3)Allows users to draw free-hand or use re-sizable shapes;
(4)Allows users to incorporate text, numbers, and scientific symbols in various sizes, fonts, and colors;
(5)Saves and prints published products;
(6)Focuses on core vocabulary, idioms, and STEM content;
(7)Incorporates interpretation of STEM symbols into ASL/MSL;
(8)Generates customizable and printable Instructional Activities that reinforce vocabulary and concepts found in instructional content using Templates:
a. Sign language BINGO cards,
b. Crossword puzzles,
c. Finger spelling/spelling scrambles,
d. Word searches (in finger spelling and text),
e. Flashcards (with sign, text, and concept graphic options), and
f. Matching games (i.e., Standard Arabic-to-MSL and English-to-ASL).
(cf. Figure 1: Screenshots from Clip and Create)
The ability of this tool to efficiently create bilingual (i.e., MSL and written Arabic and ASL and English) educational materials will have a profound positive impact on the quantity and quality of sign-supported curricular materials teachers and parents are able to create for young deaf students. And, as a consequence, deaf children will show improved vocabulary recognition, reading fluency, and comprehension.
A unique aspect of this software is that written Arabic is used by many Arab countries even though the spoken language varies. Though there are variations in signs as well, there is enough consistency to make this product useful in other Arabic-speaking nations as is. Any signing differences can easily be adjusted by swapping sign graphic images.
-