Qatar Foundation Annual Research Conference Proceedings Volume 2014 Issue 1
- Conference dates: 18-19 Nov 2014
- Location: Qatar National Convention Center (QNCC), Doha, Qatar
- Volume: 2014
- Published: 18 November 2014
381 - 400 of 480 results
-
-
Annotation Guidelines For Non-native Arabic Text In The Qatar Arabic Language Bank
Authors: Wajdi Zaghouani, Nizar Habash, Behrang Mohit, Abeer Heider, Alla Rozovskaya and Kemal Oflazer
The Qatar Arabic Language Bank (QALB) is a corpus of naturally written, unedited Arabic and its manually edited corrections. QALB has about 1.5 million words of text written and post-edited by native speakers. The corpus was the focus of a shared task on automatic spelling correction in the Arabic Natural Language Processing Workshop held in conjunction with the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) in Doha, with nine research teams from around the world competing. In this poster we discuss some of the challenges of extending QALB to include non-native Arabic text. Our overarching goal is to use QALB data to develop components for the automatic detection and correction of language errors that can be used to help Standard Arabic learners (native and non-native) improve the quality of the Arabic text they produce. The QALB annotation guidelines have focused on native-speaker text. Learners of Arabic as a second language (L2 speakers) typically have to adapt to a different script and a different vocabulary with new grammatical rules. These factors lead L2 speakers to make errors of a different nature than those produced by native speakers (L1 speakers), who are mostly affected by their dialects and their levels of education and use of Standard Arabic. Our extended L2 guidelines build on our L1 guidelines with a focus on the types of errors usually found in L2 writing and on how to deal with problematic ambiguous cases. Annotated examples are provided in the guidelines to illustrate the various annotation rules and their exceptions. As with the L1 guidelines, the L2 texts should be corrected with the minimum number of edits that produce semantically coherent (accurate) and grammatically correct (fluent) Arabic. The guidelines also define a priority order for corrections that prefers less intrusive edits, starting with inflection, then cliticization, derivation, preposition correction, word choice correction, and finally word insertion. This project is supported by the National Priority Research Program (NPRP grant 4-1058-1-168) of the Qatar National Research Fund (a member of the Qatar Foundation). The statements made herein are solely the responsibility of the authors.
-
-
-
TRUC: Towards Trusted Communication For Emergency Scenarios In Vehicular Ad Hoc Networks (VANETs) Against Illusion Attack
Authors: Maria Elsa Mathew and Arun Raj Kumar P
With data proliferating at an unprecedented rate, the need for data accessibility while in motion has recently increased the demand for VANETs. VANETs use moving cars as nodes to create a mobile network. They are designed with the goals of enhancing driver safety and providing passenger comfort. Providing security to VANETs is important in terms of user anonymity, authentication, integrity and privacy of data. Attacks on VANETs include the Sybil attack, DDoS attack, misbehaving and faulty nodes, sinkhole attack, spoofing, traffic analysis attack, position attack and illusion attack. The illusion attack is a recent threat in which the attacker generates fraudulent traffic messages to mislead drivers, thereby changing their driving behaviour. The attacker thus achieves his goal by moving the target vehicle along a traffic-free route and creating traffic jams in areas where he wishes. The illusion attack is devised mainly by thieves and terrorists who require a clean getaway path. The existing method used to prevent the illusion attack is the Plausibility Validation Network (PVN). In PVN, message validation is based on a set of rules depending on the message type. This is an overhead, as the rule set for all possible message types has to be stored and updated in the database. Hence, an efficient mechanism is required to prevent illusion attacks. In this paper, our proposed system, TRUC, verifies the message with a Message Content Validation (MCV) algorithm, thus ensuring the safety of car drivers and passengers. The possibilities of an attacker creating an illusion attack are explored in all dimensions, and the security goals are analyzed for our proposed design, TRUC.
-
-
-
Software-hardware Co-design Approach For Gas Identification
Authors: Amine Ait Si Ali, Abbes Amira, Faycal Bensaali, Mohieddine Benammar, Muhammad Hassan and Amine Bermak
Gas detection is one of the major processes that has to be taken into consideration as an important part of a monitoring system for the production and distribution of gases. Gas is a critical resource; therefore, for safety reasons, it is imperative to monitor in real time all parameters such as temperature, concentration and mixture. The research presented in this abstract on gas identification is part of an ongoing research project aiming at the development of a low-power reconfigurable self-calibrated multi-sensing platform for gas applications. A gas identification system can be described as a pattern recognition problem. The decision tree classifier is a widely used classification technique in data mining due to its low implementation complexity. It is a supervised learning technique consisting of a succession of splits that leads to the identification of predefined classes. The decision tree algorithm has been applied and evaluated for the hardware implementation of a gas identification system. The data used for training is collected from a 16-element SnO2 gas sensor array. The sensor array is exposed to three types of gases (CO, ethanol and H2) at ten different concentrations (20, 40, 60, 80, 100, 120, 140, 160, 180 and 200 ppm), and the experiment is repeated twice to generate 30 patterns for training and another 30 patterns for testing. Training is performed in MATLAB. It is first done using the raw data, i.e. the steady-state responses, and then using data transformed by principal component analysis. Table 1 shows the decision tree training results. These include the trees obtained from learning on the original data and on different combinations of principal components. The resulting models are implemented in C and synthesised using the Vivado High Level Synthesis (HLS) tool for quick prototyping on the heterogeneous Zynq platform. Table 2 illustrates the on-chip resource usage, maximum running frequency and latency for the implementation of the trained decision tree models. The use of Vivado HLS helped optimise the hardware design by applying different directives, such as the one that allows loop unrolling for better parallelism. The best performance is obtained when the first three principal components are used for training, which results in an accuracy of 90%. The hardware implementation illustrated that a trade-off has to be found between the accuracy of the identification and the performance in terms of processing time. It is planned to use drift compensation techniques to obtain more accurate steady states and increase the performance of the system. The system can be easily adapted to other types of gases by exposing the new gas to the sensor array, collecting the data, performing the training and finally implementing the model.
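A rough single-machine illustration of the training flow described above (not the authors' MATLAB/Vivado HLS pipeline): PCA followed by a decision tree on a toy sensor-array dataset. The array shape (30 patterns × 16 sensors) and gas labels follow the abstract, while the synthetic responses themselves are made up.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Toy stand-in for the 16-sensor array responses: 30 training patterns
# (3 gases x 10 concentrations) and 30 test patterns from a repeat run.
X_train = rng.normal(size=(30, 16))
X_test = rng.normal(size=(30, 16))
y_train = np.repeat(["CO", "Ethanol", "H2"], 10)
y_test = np.repeat(["CO", "Ethanol", "H2"], 10)

# Reduce the 16 steady-state features to the first 3 principal components,
# then train a decision tree on the reduced representation.
pca = PCA(n_components=3).fit(X_train)
clf = DecisionTreeClassifier(random_state=0).fit(pca.transform(X_train), y_train)

accuracy = clf.score(pca.transform(X_test), y_test)
print(f"identification accuracy: {accuracy:.0%}")
```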
-
-
-
E-government Alerts Correlation Model
Authors: Aadil Salim Al-mahrouqi, Sameh Abdalla and Tahar Kechadi
Background & Objectives: Qatar's IT infrastructure is rapidly growing to accommodate the evolution of business and the economic growth the country is increasingly witnessing throughout its industries. It is now evident that the country's e-government requirements and associated data management systems are becoming large in number, highly dynamic in nature, and exceptionally attractive for cybercrime activities. Protecting the sensitive data that e-government portals rely on for daily activities is not a trivial task. The techniques used to perform cybercrimes are becoming more sophisticated relative to the firewalls protecting against them. Reaching a high level of data protection, in both wired and wireless networks, in order to face recent cybercrime approaches is a challenge that has continuously proven hard to achieve. In a common IT infrastructure, the deployed network devices contain a number of event logs that reside locally within their memory. These logs are large in number, and therefore analyzing them is a time-consuming task for network administrators. In addition, a single network event often generates many redundant, similar event logs belonging to the same class within short time intervals. This makes them difficult to manage during forensic investigation. In most cybercrime cases, a single alert log does not contain sufficient information about the background of malicious actions and invisible network attackers. The information for a particular malicious action or attacker is often distributed among multiple alert logs and multiple network devices. The forensic investigator's mission of reconstructing incident scenarios is now very complex considering the number as well as the quality of these event logs. Methods: My research will focus on applying mathematics and algorithmic science to each proposed sub-model in the alerts correlation model. After alert logs are collected from network sensors, they are stored in the alert-log warehouse. The stored alert logs contain redundant data and irrelevant information. The alert correlation model is used to filter out all redundant data and irrelevant information from the alert logs. This model contains two stages: format standardization and redundancy management. The format standardization process aims to unify different event log formats into one format, while the redundancy management process aims to reduce the duplication of single events. Furthermore, this research will try to utilize criminology to enhance the security level of the proposed model, and forensic experimentation tools to validate the proposed approach. Results: In response to attacks and potential attacks against network infrastructure and assets, my research focuses on how to build an organized legislative e-government environment. The idea of this approach is to forensically utilize the current network security output by collecting, analyzing and presenting evidence of network attacks in an efficient manner. After the data mining process, we can utilize our preprocessing results for e-government awareness purposes. Conclusions: This research proposes a Qatar e-government alerts correlation model. The proposed model is used to process and normalize the captured network event logs. The main point of designing the model is to find a way to forensically visualize the evidence and attack scenarios in the e-government infrastructure.
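A minimal sketch of the redundancy-management idea, in which duplicate alerts of the same class from the same source are collapsed within a short time window. The field names, window length and sample records are illustrative assumptions, not the model's actual schema.

```python
from datetime import datetime, timedelta

# Hypothetical normalized alert records after format standardization.
alerts = [
    {"time": datetime(2014, 11, 18, 10, 0, 0), "src": "fw-1", "class": "port-scan"},
    {"time": datetime(2014, 11, 18, 10, 0, 5), "src": "fw-1", "class": "port-scan"},
    {"time": datetime(2014, 11, 18, 10, 7, 0), "src": "fw-1", "class": "port-scan"},
]

def deduplicate(alerts, window=timedelta(minutes=5)):
    """Keep one alert per (source, class) pair within each time window."""
    kept, last_seen = [], {}
    for alert in sorted(alerts, key=lambda a: a["time"]):
        key = (alert["src"], alert["class"])
        if key not in last_seen or alert["time"] - last_seen[key] > window:
            kept.append(alert)
        last_seen[key] = alert["time"]
    return kept

print(len(deduplicate(alerts)))  # 2: the second alert is treated as a duplicate
```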
-
-
-
First Hybrid 1 Gbps/0.1 Gbps Free-Space Optical/RF System Deployment And Testing In The State Of Qatar
Authors: Syed Jawad Hussain, Abir Touati, Mohammad Elamri, Hossein Kazemi, Farid Touati and Murat Uysal
I. BACKGROUND & OBJECTIVES: Owing to its high bandwidth, robustness to EMI, and operation in unregulated spectrum, free-space optical communication (FSO) is uniquely qualified as a promising alternative or complementary technology to fiber-optic and wireless radio-frequency (RF) links. Despite the vibrant advantages of FSO technology and the variety of its applications, its widespread adoption has been hampered by rather disappointing link reliability for long-range links due to atmospheric turbulence-induced fading and sensitivity to detrimental climate conditions. A major challenge for such hybrid systems is to provide a strong backup system with soft-switching capabilities when the FSO link goes down. The specific objective of this work is to study, for the first time in Qatar and the GCC, the link capacity, link availability, and link outage of an FSO system with RF backup (i.e. hybrid FSO/RF) under a harsh environment. II. METHODS: In this work, a practical demonstration of a hybrid FSO/RF link system is shown. The system has a capacity of 1 Gbps and 100 Mbps for FSO and RF, respectively. It is installed at Qatar University on two different buildings 600 m apart and 20 feet high. This system is basically a point-to-point optical link that uses infrared laser light to wirelessly transmit data. Moreover, the proposed system is capable of parallel transmission over both links. In order to analyze the two transport media, we used the IPERF tool. Its Java-based GUI (jperf) application can act as either a server or a client, and is available on a variety of platforms. We tested end-to-end throughput by running the IPERF tool in server mode on one laptop and in client mode on another. III. RESULTS: Figure 1 shows a block diagram of the system used. Initial results were obtained for the two links under the same climatic and environmental conditions, where the average ambient temperature reached 50°C and RH exceeded 80% (July-August 2014). Both the FSO and RF links allowed transfer rates of around 80% of their full capacity. During all experiments while running both links simultaneously, there was no FSO link failure. In case of an FSO failure, the RF link is expected to take over within 2 seconds (hard switching), which might cause a loss of data. Detailed results on FSO-to-RF switching and the induced packet loss will be reported in the full manuscript and during the presentation. IV. CONCLUSION: Tests on an FSO/RF link have been carried out for the first time in Qatar. Initial results showed that both the FSO and RF links operated close to their capacity. During summer, Qatari weather did not induce FSO link outage. The team is focusing on developing seamless FSO-RF soft switching using NetFPGA boards and raptor coding.
-
-
-
WiGest: A Ubiquitous WiFi-Based Gesture Recognition System
Authors: Heba Abdelnasser, Khaled Harras and Moustafa Youssef
Motivated by freeing the user from specialized devices and leveraging natural and contextually relevant human movements, gesture recognition systems are becoming popular as a fundamental approach for providing HCI alternatives. Indeed, there is a rising trend in the adoption of gesture recognition systems in various consumer electronics and mobile devices. These systems, along with research enhancing them by exploiting the wide range of sensors available on such devices, generally adopt various techniques for recognizing gestures, including computer vision, inertial sensors, ultrasonics, and infrared. While promising, these techniques have various limitations, such as being tailored for specific applications, sensitivity to lighting, high installation and instrumentation overhead, requiring the mobile device to be held, and/or requiring additional sensors to be worn or installed. We present WiGest, a ubiquitous WiFi-based hand gesture recognition system for controlling applications running on off-the-shelf WiFi-equipped devices. WiGest does not require additional sensors, is resilient to changes within the environment, and can operate in non-line-of-sight scenarios. The basic idea is to leverage the effect of in-air hand motion on the wireless signal strength received by the device from an access point to recognize the performed gesture. As shown in Figure 1, WiGest parses combinations of signal primitives, along with other parameters such as the speed and magnitude of each primitive, to detect various gestures, which can then be mapped to distinguishable application actions. There are several challenges we address in WiGest, including handling noisy RSSI values due to multipath interference and other electromagnetic noise in the wireless medium; handling gesture variations and their attributes for different humans or the same human at different times; handling interference due to the motion of other people within proximity of the user's device; and finally being energy-efficient to suit mobile devices. To address these challenges, WiGest leverages different signal processing techniques that preserve signal details while filtering out the noise and variations in the signal. We implement WiGest on off-the-shelf laptops, mapping the effect of hand motion on the RSSI to a signal composed of three primitives: rising edge, falling edge, and pause. We evaluate its performance with different users in apartment and engineering-building settings. Various realistic scenarios are tested, covering more than 1000 primitive actions and gestures each, in the presence of interfering users in the same room as well as other people moving on the same floor during their daily life. Our results show that WiGest can detect the basic primitives with an accuracy of 90% using a single AP for distances reaching 26 ft, including through-the-wall non-line-of-sight scenarios. This increases to 96% using three overheard APs, a typical case for many WiFi deployments. When evaluating the system using a multimedia player application case study, we achieve a classification accuracy of 96%.
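To make the primitive-extraction idea concrete, here is a simplified sketch (not the WiGest implementation) that denoises a synthetic RSSI trace with a moving average and labels rising-edge, falling-edge, and pause primitives from the smoothed slope. The thresholds, window size and trace shape are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic RSSI trace (dBm): a pause, a rise, a pause, then a fall, plus noise.
rssi = np.concatenate([np.full(50, -60.0), np.linspace(-60, -50, 30),
                       np.full(50, -50.0), np.linspace(-50, -60, 30)])
rssi = rssi + rng.normal(scale=0.5, size=rssi.size)

def primitives(signal, win=10, thresh=0.05):
    """Label each sample as 'rise', 'fall', or 'pause' from the smoothed slope."""
    smooth = np.convolve(signal, np.ones(win) / win, mode="same")
    slope = np.gradient(smooth)
    return np.where(slope > thresh, "rise",
                    np.where(slope < -thresh, "fall", "pause"))

labels = primitives(rssi)
# Collapse consecutive identical labels into a primitive sequence.
sequence = [labels[0]] + [b for a, b in zip(labels, labels[1:]) if b != a]
print(sequence)
```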
-
-
-
Physical Layer Security For Communications Through Compound Channels
Authors: Volkan Dedeoglu and Joseph Boutros
Secure communication is one of the key challenges in the field of information security, as the transmission of information between legitimate users is vulnerable to interception by illegitimate listeners. State-of-the-art secure communication schemes employ cryptographic encryption methods. However, the use of cryptographic encryption methods requires the generation, distribution and management of keys to encrypt the confidential message. Recently, physical layer security schemes that exploit the difference between the channel conditions of the legitimate users and the illegitimate listeners have been proposed for enhanced communication security. We propose novel coding schemes for secure transmission of messages over compound channels that provide another level of security at the physical layer on top of the existing cryptographic security mechanisms in the application layer. Our aim is to provide secrecy against illegitimate listeners while still offering good communication performance for legitimate users. We consider the transmission of messages over compound channels, where there are multiple parallel communication links between the legitimate users and an illegitimate listener intercepts one of the communication links, which one being unknown to the legitimate users. We propose a special source splitter structure and a new family of low-density parity-check code ensembles to achieve secure communication against an illegitimate listener and provide error correction capability for the legitimate listener. First, the source bit sequence is split into multiple bit sequences by a source splitter. The source splitter is required to ensure that the illegitimate listener does not have direct access to the secret message bits. Then, a special error correction code is applied to the bit sequences that are the outputs of the source splitter. The error correction code is based on a special parity-check matrix composed of subblocks with specific degree distributions. We show that the proposed communication schemes can provide algebraic and information-theoretic security. Algebraic security means that the illegitimate listener is unable to solve for any of the individual binary bits of the secret message. Information-theoretic security guarantees the highest level of secrecy by revealing no information about the secret message to the illegitimate listener. The error correction encoder produces multiple codewords to be sent on the parallel links. Having access to the noisy outputs of the parallel links, the legitimate receiver recovers the secret message. The finite-length performance analysis of the proposed secure communication scheme for the legitimate listener shows good results in terms of the bit error rate and the frame error rate over the binary-input additive white Gaussian noise channel. The asymptotic performance of our scheme for sufficiently large block length is found via density evolution equations. Since the proposed low-density parity-check code is a multi-edge-type code on graphs, there are two densities that characterize the system performance. The thresholds obtained by the density evolution equations of our scheme show comparable or improved results when compared to fully random low-density parity-check codes.
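The abstract does not give the splitter construction; as a purely illustrative stand-in, the sketch below uses XOR-based splitting, in which each individual output sequence is statistically independent of the secret bits, so intercepting a single link reveals nothing on its own. This is not the authors' splitter, only one way the stated requirement can be met.

```python
import secrets
from functools import reduce

def split_bits(secret_bits, n_links=3):
    """Split a bit sequence into n_links shares whose bitwise XOR recovers it."""
    shares = [[secrets.randbelow(2) for _ in secret_bits] for _ in range(n_links - 1)]
    last = [reduce(lambda a, b: a ^ b, column) ^ bit
            for bit, *column in zip(secret_bits, *shares)]
    return shares + [last]

def combine_bits(shares):
    """Recombine the shares received over the parallel links."""
    return [reduce(lambda a, b: a ^ b, column) for column in zip(*shares)]

secret = [1, 0, 1, 1, 0, 0, 1, 0]
shares = split_bits(secret)          # one share would be encoded and sent per link
assert combine_bits(shares) == secret
print(shares)
```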
-
-
-
Sparsity-aware Multiple Relay Selection In Large Decode-and-forward Relay Networks
Authors: Ala Gouissem, Ridha Hamila, Naofal Al-dhahir and Sebti Foufou
Cooperative communication is a promising technology that has attracted significant attention recently thanks to its ability to achieve spatial diversity in wireless networks with only single-antenna nodes. The different nodes of a cooperative system can share their resources so that a virtual multiple-input multiple-output (MIMO) system is created, which leads to spatial diversity gains. To exploit this diversity, a variety of cooperative protocols have been proposed in the literature under different design criteria and channel information availability assumptions. Among these protocols, two of the most widely used are the amplify-and-forward (AF) and decode-and-forward (DF) protocols. However, in large-scale relay networks, the relay selection process becomes highly complex. In fact, in many applications such as device-to-device (D2D) communication networks and wireless sensor networks, a large number of cooperating nodes are used, which leads to a dramatic increase in the complexity of the relay selection process. To solve this problem, the sparsity of the relay selection vector has been exploited to reduce the multiple relay selection complexity for large AF cooperative networks while also improving the bit error rate performance. In this work, we extend the study from AF to large-scale decode-and-forward (DF) relay networks. Based on exploiting the sparsity of the relay selection vector, we propose and compare two different techniques (referred to as T1 and T2) that aim to improve the performance of multiple relay selection in large-scale decode-and-forward relay networks. In fact, when only a few relays are selected from a large number of relays, the relay selection vector becomes sparse. Hence, utilizing recent advances in sparse signal recovery theory, we propose to use signal recovery algorithms such as Orthogonal Matching Pursuit (OMP) to solve the relay selection problem. Our theoretical and simulation results demonstrate that our two proposed sparsity-aware relay selection techniques are able to improve the outage performance and reduce the computational complexity at the same time compared with the conventional exhaustive search (ES) technique. In fact, compared to the ES technique, T1 reduces the selection complexity by O(K^2 N) (where N is the number of relays and K is the number of selected relays) while outperforming it in terms of outage probability irrespective of the relays' positions. Technique T2 yields a higher outage probability than T1 but further reduces the complexity, offering a compromise between complexity and outage performance. The best selection threshold for T2 is also theoretically calculated and validated by simulations, which enables T2 to also improve the outage probability compared with the ES technique. Acknowledgment: This publication was made possible by NPRP grant #6-070-2-024 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.
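As a hedged illustration of casting sparse selection as a recovery problem, the snippet below uses scikit-learn's Orthogonal Matching Pursuit to recover a K-sparse selection vector from a linear measurement model. The random measurement matrix stands in for the network-dependent model, which the abstract does not spell out.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(2)
N, K = 100, 4                      # N candidate relays, K relays to be selected

# Ground-truth sparse selection vector: only K nonzero entries.
x_true = np.zeros(N)
x_true[rng.choice(N, size=K, replace=False)] = rng.uniform(0.5, 1.0, size=K)

# Illustrative linear measurement model y = A x (A is a placeholder for the
# problem-specific channel/network matrix).
A = rng.normal(size=(40, N))
y = A @ x_true

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=K).fit(A, y)
selected = np.flatnonzero(omp.coef_)
print("selected relays:", selected, "true:", np.flatnonzero(x_true))
```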
-
-
-
Inconsistencies Detection In Islamic Texts Of Law Interpretations ["fatawas"]
Authors: Jameela Al-otaibi, Samir Elloumi, Abdelaali Hassaine and Ali Mohamed Jaoua
Islamic web content offers a very convenient way for people to learn more about Islam and its correct practices. For instance, via these web sites they can ask for fatwas (Islamic advisory opinions) with greater ease and serenity. Given the sensitivity of the subject, large communities of researchers are working on the evaluation of these web sites according to several criteria. In particular, there is a huge effort to check the consistency of the content with respect to the Islamic shariaa (Islamic law). In this work we propose a semiautomatic approach for evaluating the Islamic content of web sites, in terms of inconsistency detection, composed of the following steps: (i) Domain selection and definition: this consists of identifying the most relevant named entities related to the selected domain as well as their corresponding values or keywords (NEV). At this stage, we have started building the fatwa ontology by analyzing around 100 fatwas extracted from the online system. (ii) Formal representation of the Islamic content: this consists of representing the content as a formal context relating fatwas to NEV. Here, each named entity is split into different attributes in the database, where each attribute is associated with a possible instantiation of the named entity. (iii) Rule extraction: by applying the ConImp tool, we extract a set of implications (or rules) reflecting cause-effect relations between NEV. As an extended option aiming at more precise analysis, we have proposed the inclusion of negative attributes. For example, for the word "licit" we may associate "not licit" or "forbidden"; for the word "recommended" we associate "not recommended"; etc. At this stage, by using an extension of the Galois connection, we are able to find the different logical associations in a minimal way using the same ConImp tool. (iv) Conceptual reasoning: the objective is to detect possible inconsistencies between the rules and evaluate their relevance. Each rule is mapped to a binary table in a relational database model. By joining the obtained tables we are able to detect inconsistencies. We may also check that a new law does not contradict the existing set of laws by mapping the law into a logical expression; by creating a new table corresponding to its negation, we are able to prove its consistency automatically as soon as we obtain an empty join of the total set of joins. This preliminary study showed that the logical representation of fatwas gives promising results in detecting inconsistencies within the fatwa ontology. Future work includes using automatic named entity extraction and automatic transformation of laws into a formatted database, with which we should be able to build a global system for inconsistency detection for the domain. ACKNOWLEDGMENT: This publication was made possible by a grant from the Qatar National Research Fund through the National Priority Research Program (NPRP) No. 06-1220-1-233. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the Qatar National Research Fund or Qatar University.
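A toy sketch of the reasoning step (not ConImp or the relational-join implementation): rules are modeled as implications over positive and negated attributes, their closure is computed by forward chaining, and an inconsistency is flagged when an attribute and its negation are both derived. All rule content below is invented for illustration.

```python
def closure(facts, rules):
    """Forward-chain implications (premises -> conclusion) from a set of facts."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

def inconsistent(derived):
    """True if some attribute and its negation are both derived."""
    return any(f.startswith("not ") and f[4:] in derived for f in derived)

# Invented example rules standing in for implications extracted from fatwas.
rules = [({"fasting", "travelling"}, "not obligatory"),
         ({"fasting"}, "obligatory")]
facts = {"fasting", "travelling"}

derived = closure(facts, rules)
print(derived, "inconsistent:", inconsistent(derived))  # flags the contradiction
```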
-
-
-
A Low Power Reconfigurable Multi-sensing Platform For Gas Application
The presence of toxic gases and accidental explosions in the gas industry has led researchers to develop electronic nose systems that can indicate the nature and parameters of the gas passing through different vessels. Therefore, in this research we propose a low-power Radio Frequency Identification (RFID) based gas sensor tag that can monitor these parameters and indicate the type of gas. The research work is divided into three main parts. The first two parts cover the design and analysis of the low-power multi-sensors and processing unit, while the last part focuses on a passive RFID module that provides communication between the sensor and the processing unit, as shown in Fig. 1. In passive RFID applications, power consumption is one of the most prominent parameters because most of the power is harvested from the incoming RF signal. Therefore, a ring-oscillator-based low-power temperature sensor is designed to measure the gas thermodynamic conditions. The oscillator is designed using the thyristor-based delay element [7], in which the current source present for temperature compensation has been displaced to make the delay element temperature dependent. The proposed temperature sensor consumes 47 nW at 27 °C, and its power consumption increases linearly with temperature. Moreover, a 4x4 array of tin-oxide gas sensors based on convex micro-hotplates (MHPs) is also utilized to identify the type of gas. The array is designed such that each sensor of the array provides a different pattern for the same gas. The power consumption of the temperature and gas sensors is on the order of a few µW. The prime advantage of the MHP can be seen in the 950 °C annealed MHP, which exhibits a thermal efficiency of 13 °C/mW. Moreover, it requires a driving voltage of only 2.8 V to reach 300 °C in less than 5 ms, which makes it compatible with the power supplies required by CMOS ICs. The gas sensor provides 16 feature points at a time, which can result in hardware complexity and throughput degradation in the processing unit. Therefore, a principal component analysis (PCA) algorithm is implemented to reduce the number of feature points. Thereafter, a binary decision tree algorithm is adopted to classify the gases. We implemented both algorithms on the heterogeneous Zynq platform. It is observed that the execution of PCA on the Zynq programmable SoC is 1.41 times faster than the corresponding software execution, with a resource utilization of only 23%. Finally, a passive ultrahigh-frequency (UHF) RFID transponder is developed for communication between the sensing block and the processing unit. The designed module is responsible for harvesting power from the incoming RF signal and meeting the power requirements of both sensors. The designed transponder IC achieves a minimum sensitivity of -17 dBm with a minimum operational power of 2.6 µW.
-
-
-
Utilizing Monolingual Gulf Arabic Data For Qatari Arabic-English Statistical Machine Translation
Authors: Kamla Al-mannai, Hassan Sajjad, Alaa Khader, Fahad Al Obaidli, Preslav Nakov and Stephan Vogel
With the recent rise of social media, Arabic speakers have started increasingly using dialects in writing, which has established dialectal Arabic (DA) as a field of interest in natural language processing (NLP). DA NLP is still in its infancy, both in terms of its computational resources and its tools, e.g. the lack of dialectal morphological segmentation tools. In this work, we present a 2.7M-token collection of monolingual corpora of Gulf Arabic extracted from the Web. The data is unique in that it is genre-specific, i.e. the romance genre, in spite of the various sub-dialects of Gulf Arabic that it covers, e.g., Qatari, Emirati, Saudi. In addition to the monolingual Qatari data collected, we use existing parallel corpora of Qatari (0.47M tokens), Egyptian (0.3M tokens), Levantine (1.2M tokens) and Modern Standard Arabic (MSA) (3.5M tokens) to English to develop a Qatari Arabic to English statistical machine translation system (QA-EN SMT). We exploit the monolingual data to 1) develop a morphological segmentation tool for Qatari Arabic, 2) generate a uniform segmentation scheme for the various variants of Arabic employed, and 3) build a Qatari language model for the opposite translation direction. Proper morphological segmentation of Arabic plays a vital role in the quality of an SMT system. Using the monolingual Qatari data collected, in combination with the QA side of the small existing QA-EN parallel data, we trained an unsupervised morphological segmentation model for Arabic, i.e. Morfessor, to create a word segmenter for Qatari Arabic. We then extrinsically compare the impact of the resulting segmentation (as opposed to using tools for MSA) on the quality of QA-EN machine translation. The results show that this unsupervised segmentation can yield better translation quality. Unsurprisingly, we found that removing the monolingual data from the training set of the segmenter affects the translation quality, with a loss of 0.9 BLEU points. Arabic dialect resources, when adapted for the translation of one dialect, are generally helpful in achieving better translation quality. We show that a standard segmentation scheme can improve vocabulary overlap between dialects by segmenting words with different morphological forms in different dialects to a common root form. We train a generic segmentation model for Qatari Arabic and the other Arabic variants used, using the monolingual Qatari data and the Arabic side of the parallel corpora. We train the QA-EN SMT system using the different parallel corpora (one at a time) in addition to the QA-EN parallel corpus, segmented using the generic statistical segmenter. We show a consistent improvement of 1.5 BLEU points when compared with the respective baselines with no segmentation. In the reverse translation direction, i.e. EN-QA, we show that adding a small amount of in-domain data to the language model results in a relatively large improvement, compared to the degradation resulting from adding a large amount of out-of-domain data.
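A minimal sketch of training an unsupervised segmenter with the Morfessor 2.0 Python package, assuming its baseline API (method names shown as I recall them; consult the package documentation). The file 'qatari_corpus.txt' is a hypothetical placeholder for the monolingual Qatari data plus the QA side of the parallel corpus.

```python
import morfessor

io = morfessor.MorfessorIO()

# Hypothetical training file: unsegmented Qatari Arabic text, one line per sentence.
train_data = list(io.read_corpus_file("qatari_corpus.txt"))

model = morfessor.BaselineModel()
model.load_data(train_data)
model.train_batch()

# Segment an unseen word into morphs before feeding it to the SMT pipeline.
segments, cost = model.viterbi_segment("some_arabic_word")
print(segments)
```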
-
-
-
Securing The E-infrastructure In Qatar Through Malware Inspired Cloud Self-protection
Authors: Elhadj Benkhelifa and Thomas Welsh
Whilst the state of security within the Cloud is still a contentious issue, some privacy and security issues are well known or deemed to be likely threats. When considering the ongoing threat of malicious insiders, the promised security expertise might be deemed untrusted. The focus of our research is determining the extent of issues related to the underlying technology that supports Cloud environments, mainly virtualization platforms. It is often argued that virtualization is more secure than conventional shared resources due to its inherent isolation. However, much literature cites examples to the contrary, and as such it should be considered that, as with all software, virtualization applications are susceptible to exploitation and subversion. In fact, it might even be argued that the complexity and heterogeneous nature of the environment may facilitate further security vulnerabilities. To illustrate and investigate this point, we consider the security threat of malware within the context of Cloud environments. Given the evolution of malware, combined with the knowledge that Cloud software is susceptible to vulnerabilities, it is argued that complex malware might exist for the Cloud and, if it were successful, would shed light on the security of these technologies. Whilst there are many examples of state-of-the-art malware detection and protection for Cloud environments, this work tends to focus on examining virtual machines (VMs) from another layer. The primary flaw identified in all of the current approaches is the failure to take into account malware that is aware of the Cloud environment and is thus in a position to subvert the detection process. Traditional malware security applications tend to take a defensive approach by looking for existing malware through signature analysis or behavior monitoring. Whilst such approaches are acceptable for traditional environments, they become less effective for distributed and dynamic ones. We argue that, due to this dynamic nature of the Cloud as well as its uncertain security concerns, a malware-like application may be a suitable security defense and thus operate as a proactive, self-protecting element. We present an architecture for Multi-Agent Cloud-Aware Self-Propagating Agents for Self-Protection. By adapting this architecture to include constraints (such as a kill switch), the application can be effectively controlled and thus any negative effects minimized. This application will then cross the multiple layers within the network with high privilege. Its dynamic and distributed architecture will allow it to survive removal by malware whilst hunting down malicious agents and patching systems as necessary. In order to survive in the hostile and dynamic Cloud environment, the software incorporates a multi-component, multi-agent architecture, which has shown success in the past with malware that propagates in heterogeneous environments. The components consist of passive and active sensors to learn about the environment, distributed storage to provide redundancy, and controller/constructor agents for localized coordination. The proposed architecture has been implemented successfully and the desired results were achieved. The research outputs hold significant potential, particularly for complex and highly dynamic infrastructures such as those envisioned for DigitalQatar.
-
-
-
Watch Health (w-health)
Authors: Mohammed Alotaibi
Smart watches have been available for quite some time now. However, with the announcement of the "Apple Watch" by Apple Inc., a strong buzz about smart watches has been created. The highly anticipated success of the Apple Watch can also be linked to an expected increase in the smart watch market in the very near future. Apart from Apple, other big companies such as Sony, Motorola and Nike have their own brands of smart watches. Therefore, strong market competition would arise, leading to competitive prices, technologies, and designs, which would possibly lead to increased popularity of smart watches. Following the recent announcement of the Apple Watch, several online and newspaper articles have suggested that its most important application would be in the field of healthcare. This is also backed by the applications available in the Apple Watch, which include GPS tracking, a gyroscope, an accelerometer, a pulse monitor, a calorie counter, an activity tracker, the Siri voice assistant, and a host of other applications. Further, the Apple Watch is backed by a powerful operating system and hardware processor. This buzz about smart watches raises one question: how effectively can these smart watches be used for providing healthcare solutions? The use of smart devices for healthcare services has been a topic of extensive research over the last decade, which has resulted in several healthcare solutions for various types of disease management and patient monitoring, especially for chronic lifelong diseases and the elderly. With the emergence of smart watches, it is now time to further explore the possibility of using smart watches for healthcare services. Some of the advantages of smart watches for healthcare services are: they are easily wearable and portable and can almost be a part of everyday attire, similar to regular watches; they are relatively cheaper than other smart devices such as smart mobile phones and similar gadgets; with the advancements in hardware and software technologies, they are now as powerful as a high-end smart phone and can host several types of applications; they can be adapted and customised to provide various disease diagnosis and management functions according to individual patient needs; they can include several sensors and also provide a platform for software applications for patient health monitoring and reporting; and, with the use of voice-based applications such as Siri, patients who have difficulty using modern gadgets or reading and writing can also use the device more easily. There is no doubt that the iWatch and other smart watches provide not only numerous possibilities for adapting and implementing existing smart healthcare solutions but also a new platform for developing novel healthcare solutions. Research and development in current mobile-health solutions should now also focus on developing smart-watch-based healthcare solutions. Hence, watch health (w-health) is a division of electronic health practice, which is defined as "a medical practice supported with the use of smart watch technology, which includes smart watches, smart mobile phones, wireless devices, patient monitoring devices and many others".
-
-
-
An Enhanced Locking Range 10.5-GHz Divide-by-3 Injection-Locked Frequency Divider With Even-Harmonic Phase Tuning In 0.18-µm BiCMOS
Authors: Sanghun Lee, Sunhwan Jang and Cam Nguyen
A frequency divider is one of the key building blocks in phase-locked loops (PLLs) and frequency synthesizers, which are essential subsystems in wireless communication and sensing systems. A frequency divider divides the output frequency of a voltage-controlled oscillator (VCO) in order to compare it with a reference clock signal. Among the different types of frequency dividers, the injection-locked frequency divider (ILFD) is becoming more popular due to its low power and high frequency characteristics. However, the ILFD has an inherent problem of narrow locking range, which defines the range over which a frequency-division operation is supported. One of the most obvious ways to increase the locking range is to inject higher power. In fully integrated PLLs, however, the injection signal is supplied by an internal VCO, which typically has a limited, fixed output power; hence, enhancing the locking range with a large injection signal power is difficult. In this work, we present the development of a fully integrated 10.5-GHz divide-by-3 (1/3) injection-locked frequency divider (ILFD) that can provide extra locking range with a small, fixed injection power. The ILFD consists of a previously measured on-chip 10.5-GHz VCO functioning as the injection source, a 1/3 ILFD core, and an output inverter buffer. A phase tuner implemented using an asymmetric inductor is proposed to increase the locking range through even-harmonic (the 2nd harmonic for this design) phase tuning. With a fixed internal injection signal power of only -18 dBm (the measured output power of the standalone VCO with a 50-Ω reference), a 25% enhancement in the locking range, from 12 to 15 MHz, is achieved with the proposed phase tuning technique without consuming additional DC power. The frequency tuning range of the integrated 1/3 ILFD is from 3.3 GHz to 4.2 GHz. The proposed 1/3 ILFD is realized in a 0.18-µm BiCMOS process, occupies 0.6 mm × 0.7 mm, and consumes 10.6 mA (the ILFD alone consumes 6.15 mA) from a 1.8-V supply. The main objective of this work is to propose a new technique of phase tuning of the even harmonics that can "further" increase the locking range by an "extra" amount beyond what can be achieved by other techniques. Since the developed technique can enhance the locking range further at a fixed injection power, it can be used in conjunction with other techniques for further enhancing the locking range. For instance, the locking range can be increased by using different injection powers and then further enhanced by tuning the phase of the even harmonics at each power level. The "extra" locking range, not achievable without the even-harmonic phase tuning, amounts to 25%, which is very attractive for PLL applications. Furthermore, additional tuning mechanisms, such as the use of a capacitor bank, can be employed to achieve an even wider tuning range for applications such as PLLs.
-
-
-
Experimental Study Of MPPT Algorithms For PV Solar Water Pumping Applications
Authors: Badii Gmati
The energy utilization efficiency of commercial photovoltaic (PV) pumping systems can be significantly improved by employing one of the many MPPT methods available in the literature, such as Constant Voltage, Short-Current Pulse, Open Voltage, Perturb and Observe, Incremental Conductance, and non-linear methods (fuzzy logic and neural networks). This paper presents a detailed experimental study of two DSP-implemented techniques, Constant Voltage and Perturb and Observe (P&O), used for standalone PV pumping systems. The influence of the algorithm parameters on system behavior is investigated, and the various advantages and disadvantages of each technique are identified for different weather conditions. Practical results obtained using dSPACE DS1104 show excellent performance, and optimal system operation is attained regardless of changing weather conditions.
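For reference, a bare-bones Perturb and Observe loop, independent of the dSPACE DS1104 implementation; the toy PV curve, starting voltage and step size are placeholders for the real panel and converter.

```python
def pv_current(v, v_oc=38.0, i_sc=8.0):
    """Toy PV I-V curve (placeholder for the real panel at a given irradiance)."""
    return max(i_sc * (1 - (v / v_oc) ** 8), 0.0)

def perturb_and_observe(v_ref=20.0, step=0.5, iterations=200):
    """Basic P&O: keep perturbing the operating voltage in the direction that raises power."""
    prev_power = v_ref * pv_current(v_ref)
    direction = +1
    for _ in range(iterations):
        v_ref += direction * step          # perturb the voltage reference
        power = v_ref * pv_current(v_ref)  # observe the resulting power
        if power < prev_power:             # power dropped: reverse the perturbation
            direction = -direction
        prev_power = power
    return v_ref

print(f"operating voltage near the MPP: {perturb_and_observe():.1f} V")
```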
-
-
-
Geometrical Modeling And Kinematic Analysis Of Articulated Tooltips Of A Surgical Robot
INTRODUCTION: The advent of the da Vinci surgical robot (Intuitive Surgical, California, USA) has allowed complex surgical procedures in urology, gynecology, cardiothoracic surgery, and pediatrics to be performed with better clinical outcomes. The end effectors of these robots exhibit enhanced dexterity with an improved range of motion, leading to better access and precise control during surgery. Understanding the design and kinematics of these end effectors (which imitate surgical instruments' tooltips) would assist in replicating their complex motion in a computer-generated environment. This would further support the development of computer-aided robotic surgical applications. The aim of this work is to develop a software framework comprising geometric three-dimensional models of the surgical robot tooltips along with their kinematic analysis. METHODS: The geometric models of the surgical tooltips were designed based on the EndoWristTM instruments of the da Vinci surgical robot. The shapes of the links and the inter-link distances of the EndoWristTM instruments were measured in detail. A three-dimensional virtual model was then recreated using CAD software (SolidWorks, Dassault Systemes, Massachusetts, USA). The kinematic analysis was performed considering the trocar as the base frame for actuation. The actuation mechanism of the tool is composed of a prismatic joint (T1) followed by four revolute joints (Ri; i = 1 to 4) in tandem (Figure 1). The relationship between consecutive joints was expressed in the form of transformation matrices using the Denavit-Hartenberg (D-H) convention. Equations corresponding to the forward and inverse kinematics were then computed using the D-H parameters and applying a geometrical approach. The kinematic equations of the designed tooltips were implemented in a modular cross-platform software framework developed using C/C++. In the software, graphical rendering was performed using OpenGL and a multi-threaded environment was implemented using the Boost libraries. RESULTS AND DISCUSSION: Five geometric models simulating the articulated motion of the EndoWristTM instruments were designed (Figure 2). These models were selected based on the five basic interactions of the surgical tooltip with anatomical structures: cauterization of tissue, stitching using needles, applying clips on vascular structures, cutting using scissors, and grasping of tissues. The developed software framework, which includes kinematics computation and graphical rendering of the designed components, was evaluated for applicability in two scenarios (Figure 3). The first scenario demonstrates the integration of the software with a patient-specific simulator for pre-operative surgical rehearsal and planning (Figure 3a). The second scenario shows the applicability of the software in generating virtual overlays of the tooltips superimposed on the stereoscopic video stream and rendered on the surgeon's console of the surgical robot (Figure 3b). This would further assist in the development of vision-based guidance for the tooltips. CONCLUSION: The geometrical modeling and kinematic analysis allowed the generation of the motion of the tooltips in a virtual space that can be used both pre-operatively and intra-operatively, i.e. before and during surgery, respectively. The resulting framework can also be used to simulate and test new tooltip designs.
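A compact sketch of forward kinematics via the D-H convention in the spirit described above. The actual joint parameters of the EndoWrist tooltips are not given in the abstract, so the table values below are placeholders.

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Homogeneous transform between consecutive frames (standard D-H convention)."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([[ct, -st * ca,  st * sa, a * ct],
                     [st,  ct * ca, -ct * sa, a * st],
                     [0.0,      sa,       ca,      d],
                     [0.0,     0.0,      0.0,    1.0]])

def forward_kinematics(dh_rows):
    """Chain the per-joint transforms from the base (trocar) frame to the tooltip."""
    T = np.eye(4)
    for theta, d, a, alpha in dh_rows:
        T = T @ dh_transform(theta, d, a, alpha)
    return T

# Placeholder table: one prismatic insertion (variable d) and four revolute joints.
dh_rows = [(0.0, 0.12, 0.0, 0.0),              # T1: insertion along the trocar axis
           (np.pi / 6, 0.0, 0.01, np.pi / 2),   # R1
           (np.pi / 8, 0.0, 0.01, -np.pi / 2),  # R2
           (0.0, 0.0, 0.008, np.pi / 2),        # R3
           (np.pi / 4, 0.0, 0.005, 0.0)]        # R4

tooltip_pose = forward_kinematics(dh_rows)
print(np.round(tooltip_pose, 3))
```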
-
-
-
Relate-me: Making News More Relevant
Authors: Tamim Jabban and Ingmar Weber
To get readers of international news stories interested and engaged, it is important to show how a piece of far-away news relates to them and how it might affect their own country. As a step in this direction, we have developed a tool to automatically create textual relations between news articles and readers. To produce such connections, the first step is to detect the countries mentioned in the article. Many news sites, including Al Jazeera (http://www.aljazeera.com), use automated tools such as OpenCalais (http://opencalais.com) to detect place references in a news article and list them as a list of countries in a dedicated section at the bottom (say, List A: [Syria, Germany]). If not already included, relevant countries could be detected using existing tools and dictionaries. The second step is to use the reader's IP address to infer the country they are currently located in (say, Country B: Qatar). Knowing this country gives us a "bridge" to the reader, as we can now try to relate the countries from List A to the reader's country, Country B. Finally, we have to decide which type of contextual bridges to build between the pairs of countries. Currently, we are focusing on four aspects: 1) Imports & Exports: this section displays imports and exports between Country B and the countries in List A, if any. For instance: "Qatar exports products worth $272m every year to Germany, 0.27% of total exports." Clicking on this information redirects the user to another website showing a breakdown of these imports and exports. 2) Distances: this simply states the direct distance in kilometers from Country B to every country in List A. For instance: "Syria is 2,110km away from Qatar." Clicking on this information navigates to a Google Maps display showing this distance. 3) Relations: this provides a link to the designated Wikipedia page on relations between Country B and every country in List A. For instance, "Germany - Qatar Relations (Wikipedia)." It also shows a link relating the countries using Google News: "More on Qatar - Germany (Google News)." 4) Currencies: this shows the currency conversion between Country B's currency and that of every other country in List A, for instance: "1 QAR = 0.21 EUR (Germany's currency)." Our current tool, which will be demonstrated live during the poster presentation, was built using JavaScript. Using tools such as Greasemonkey, we were able to test and display the results of the project on Al Jazeera (http://www.aljazeera.com) without having site ownership. We believe that the problem of making connections between countries explicit is of particular relevance to small countries such as Qatar. Whereas a user from, say, Switzerland might not usually be interested in events in Qatar, showing information regarding trade between the two countries could change their mind.
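The distance bridge can be computed with the haversine great-circle formula; the sketch below (written in Python rather than the tool's JavaScript) uses rough, assumed centroid coordinates for Qatar and Syria purely for illustration, so its output need not match the figure quoted above.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2, earth_radius_km=6371.0):
    """Great-circle distance between two points given in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))

# Approximate country centroids (illustrative values only).
qatar = (25.3, 51.2)
syria = (35.0, 38.5)
print(f"Syria is {haversine_km(*qatar, *syria):,.0f} km away from Qatar.")
```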
-
-
-
Energy And Spectrally Efficient Solutions For Cognitive Wireless Networks
Authors: Zied Bouida, Ali Ghrayeb and Khalid Qaraqe
Although different spectrum bands are allocated to specific services, it has been observed that these bands are unoccupied or only partially used most of the time. Indeed, recent studies show that 70% of the allocated spectrum is not utilized. As wireless communication systems evolve, an efficient spectrum management solution is required in order to satisfy the needs of current spectrum-greedy applications. In this context, cognitive radio (CR) has been proposed as a promising solution to optimize spectrum utilization. Under the umbrella of cognitive radio, spectrum-sharing systems allow different wireless communication systems to coexist and cooperate in order to increase their spectral efficiency. In these spectrum-sharing systems, primary (licensed) users and secondary (unlicensed) users are allowed to coexist in the same frequency spectrum and transmit simultaneously as long as the interference from the secondary user to the primary user stays below a predetermined threshold. Several techniques have been proposed in order to meet the required quality of service of the secondary user while respecting the primary user's constraints. While these techniques, including multiple-input multiple-output (MIMO) solutions, are optimized from a spectral-efficiency perspective, they are generally not well designed to address the related complexity and power consumption issues. Thus, the achievement of high data rates with these techniques comes at the expense of high energy consumption and increased system complexity. Due to these challenges, a trade-off between spectral and energy efficiency has to be considered in the design of future transmission technologies. In this context, we have recently introduced adaptive spatial modulation (ASM), which combines adaptive modulation (AM) and spatial modulation (SM), with the aim of enhancing the average spectral efficiency (ASE) of multiple-antenna systems. This technique was shown to offer high energy efficiency and low system complexity thanks to the use of SM, while achieving high data rates thanks to the use of AM. Motivated by this technique and the need for such performance in a CR scenario, we study in this abstract the concept of ASM in spectrum-sharing systems. In this work, we propose the ASM-CR scheme as an energy-efficient, spectrally efficient, and low-complexity scheme for spectrum-sharing systems. The performance of the proposed scheme is analyzed in terms of ASE and average bit error rate, and confirmed with selected numerical results using Monte Carlo simulations. These results confirm that the use of such techniques brings an improvement in terms of spectral efficiency, energy efficiency, and overall system complexity.
-
-
-
sPCA: Scalable Principal Component Analysis For Big Data
Authors: Siddharth Malhotra
Web sites, social networks, sensors, and scientific experiments today generate massive amounts of data, i.e. Big Data. Owners of this data strive to obtain insights from it, often by applying machine learning algorithms. Thus, designing scalable machine learning algorithms that run on a cloud computing infrastructure is an important area of research. Many of these algorithms use the MapReduce programming model. In this poster presentation, we show that MapReduce machine learning algorithms often face scalability bottlenecks, commonly because the distributed MapReduce algorithms for linear algebra do not scale well. We identify several optimizations that are crucial for scaling various machine learning algorithms in distributed settings. We apply these optimizations to the popular Principal Component Analysis (PCA) algorithm. PCA is an important tool in many areas, including image processing, data visualization, information retrieval, and dimensionality reduction. We refer to the proposed optimized PCA algorithm as sPCA. sPCA is implemented in the MapReduce framework. It achieves scalability by employing efficient large matrix operations, effectively leveraging matrix sparsity, and minimizing intermediate data. Experiments show that sPCA outperforms the PCA implementation in the popular Mahout machine learning library by wide margins in terms of accuracy, running time, and the volume of intermediate data generated during the computation. For example, on a 94 GB dataset of tweets from Twitter, sPCA achieves almost 100% accuracy and terminates in less than 10,000 s (about 2.8 hours), whereas the accuracy of Mahout PCA can only reach up to 70% after running for more than 259,000 s (about 3 days). In addition, both sPCA and Mahout PCA are iterative algorithms, where the accuracy improves by running more iterations until a target accuracy is achieved. In our experiments, when we fix the target accuracy at 95%, Mahout PCA takes at least two orders of magnitude longer than sPCA to achieve that target accuracy. Furthermore, Mahout PCA generates about 961 GB of intermediate data, whereas sPCA produces about 131 MB of such data, a factor of 3,511x less. This means that, compared to Mahout PCA, sPCA achieves more than three orders of magnitude of savings in network and I/O operations, which enables sPCA to scale well.
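A small single-machine illustration of one of the ideas above, leveraging matrix sparsity when computing principal components; this is not the distributed MapReduce sPCA implementation, only a sketch of how a truncated decomposition can be run directly on a sparse matrix without densifying it.

```python
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

# Sparse feature matrix standing in for, e.g., a large tweet dataset.
X = sparse_random(10_000, 2_000, density=0.001, random_state=42, format="csr")

# Top-10 principal directions via truncated SVD on the (uncentred) sparse matrix;
# centring is skipped here precisely to avoid creating a dense intermediate.
U, s, Vt = svds(X, k=10)
print(Vt.shape, s[::-1])   # components and singular values (svds returns them ascending)
```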
-
-
-
How To Improve The Health Care System By Predicting The Next Year Hospitalization
Authors: Dhoha Abid
It is very common to study patient hospitalization data to obtain useful information for improving the health care system. According to the American Hospital Association, in 2006 over $30 billion was spent on unnecessary hospital admissions. If patients that are likely to be hospitalized can be identified, admissions can be avoided, as these patients will get the necessary treatments earlier. In this context, in 2013, the Heritage Provider Network (HPN) launched the $3 million Heritage Health Prize in order to develop a system that uses available patient data (health records and claims) to predict and avoid unnecessary hospitalizations. In this work we take this competition data and try to predict each patient's number of hospitalizations. The data encompasses more than 2,000,000 patient admission records over three years. The aim is to use the data of the first and second years to predict the number of hospitalizations in the third year. In this context, a set of operations is applied, mainly data transformation, outlier detection, clustering, and regression algorithms. The data transformation operations are mainly: (1) as the data is too large to be processed at once, dividing it into chunks is mandatory; (2) missing values are either replaced or removed; (3) as the data is raw and cannot be labeled directly, different aggregation operations are applied. After transforming the data, outlier detection, clustering, and regression algorithms are applied in order to predict the third-year hospitalization count for each patient. Results show that, by applying regression algorithms directly, the relative error is 79%. However, by applying the DBSCAN clustering algorithm followed by the regression algorithm, the relative error decreases to 67%. This is because the attribute generated by the pre-processing clustering step helps the regression algorithm predict the number of hospitalizations more accurately, which is why the relative error drops. The relative error can be decreased further if the clustering pre-processing step is applied twice: the clusters generated in the first clustering step are re-clustered to generate sub-clusters, and the regression algorithm is then applied to these sub-clusters. The relative error drops significantly, from 67% to 32%. Patients sharing a common hospitalization history are grouped into clusters, and this clustering information is used to enhance the regression results.
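A schematic version of the two-stage pipeline on synthetic data (the competition data itself is not reproduced here, and the feature construction and error metric are simplified assumptions): DBSCAN clusters the patients, the cluster label is appended as an extra feature, and a regressor then predicts the third-year hospitalization count.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)

# Synthetic stand-in for aggregated year-1/year-2 claim features per patient.
X = rng.normal(size=(2000, 8))
y = np.clip(np.round(X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=2000)), 0, None)

# Stage 1: cluster patients with similar histories and use the label as a feature.
labels = DBSCAN(eps=1.5, min_samples=10).fit_predict(X)
X_aug = np.column_stack([X, labels])

# Stage 2: regression on the augmented features to predict year-3 hospitalizations.
X_tr, X_te, y_tr, y_te = train_test_split(X_aug, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
print("R^2 on held-out patients:", round(model.score(X_te, y_te), 2))
```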
-