Qatar Foundation Annual Research Conference Proceedings Volume 2018 Issue 3
- Conference date: 19-20 Mar 2018
- Location: Qatar National Convention Center (QNCC), Doha, Qatar
- Volume number: 2018
- Published: 15 March 2018
Towards a multimodal classification of cultural heritage
Authors: Abdelhak Belhi and Abdelaziz Bouras
Humanity has always learned from previous experience, and national heritage is a powerful way to discover and access a nation's history. As a result, these priceless cultural items receive special attention and require special care. Since the wide adoption of new digital technologies, documenting, storing, and exhibiting cultural heritage assets has become more affordable and reliable. These digital records are then used in several applications: researchers have seized the opportunity to use digital heritage recordings for virtual exhibitions, link discovery, and long-term preservation. Unfortunately, many cultural assets are overlooked because their history or metadata is missing. As a classic solution for labeling these assets, heritage institutions often turn to cultural heritage specialists. Institutions frequently ship their valuable assets to these specialists and may wait months or even years to hopefully get an answer. This carries multiple risks, such as the loss or damage of the assets, and is a major concern for heritage institutions around the world. Recent studies report that only 10 percent of world heritage is exhibited in museums; the remaining 90% is stored in museum archives, mainly because of damage or the lack of metadata. After a deep analysis of the current situation, our team surveyed state-of-the-art technologies that can overcome this problem. New machine learning and deep learning techniques such as convolutional neural networks (CNNs) are making a radical change in image and big-data classification. All the big technology companies, such as Google, Apple, and Microsoft, are pushing the use of these techniques to explore their enormous databases and repositories in order to better serve their users. In this contribution, we present a classification approach that aims to play the role of a digital cultural heritage expert using machine learning and deep learning techniques. The system has two main stages. The first, the learning stage, takes as input a large dataset of labeled data, mainly images of different cultural heritage assets organized into categories. This is a critical step, as the data must be descriptive and coherent. The second is the classification stage, where the system receives an image of an unlabeled asset and extracts its relevant visual features, such as shapes, edges, colors, and fine details such as text. The system then analyzes these features and predicts the missing metadata, such as the category, the year, and the region. The first tests are giving promising results. Our team aims to further improve these results using a multimodal machine learning model. Such models rely on multiple learning sources (text, videos, sound recordings, images) at the same time, and research progress shows that this technique yields very accurate predictions.
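The abstract does not include code; the following is only a minimal sketch of the two-stage workflow it describes (supervised training on labeled heritage images, then prediction of a missing label) using a small PyTorch CNN. The folder layout, network shape, and training settings are illustrative assumptions, not the authors' actual model.

```python
# Minimal sketch of a two-stage image classifier for heritage assets (assumed setup).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([transforms.Resize((128, 128)), transforms.ToTensor()])
# Hypothetical folder layout: heritage/train/<category>/<image>.jpg
train_set = datasets.ImageFolder("heritage/train", transform=transform)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

model = nn.Sequential(                       # small CNN; the real system could use a deeper net
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 32 * 32, 128), nn.ReLU(),
    nn.Linear(128, len(train_set.classes)),  # one output per heritage category
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Stage 1: learning stage -- fit the model on labeled heritage images.
for epoch in range(5):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()

# Stage 2: classification stage -- predict the missing category of an unlabeled asset.
def predict_category(image_tensor):
    model.eval()
    with torch.no_grad():
        logits = model(image_tensor.unsqueeze(0))
    return train_set.classes[int(logits.argmax())]
```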
Education/Industry Collaboration Modeling: An Ontological Approach
Authors: Houssem Gasmi and Abdelaziz Bouras
The higher education sector is one of the main suppliers of the workforce for the engineering industry and the economy in general. It is consistently challenged by a fast-evolving industry and is therefore under constant pressure to fulfill the industry's ever-changing needs: it must adapt its academic curricula to supply the industry with students who have up-to-date and relevant competencies. Nevertheless, a gap still exists between what education curricula offer and the skills that are actually needed by the industry. It is therefore crucial to find an efficient way to bridge the gap between the two worlds; doing so helps the industry cut the costs of training university graduates and assists in the advancement of higher education. In response to these issues, competency-based education was developed. It first emerged in the United States in response to growing criticism of traditional education, which was seen as increasingly disconnected from societal evolution, especially changes within the industry. Despite some criticism, the competency-based pedagogical approach has been employed by several western countries to improve their upper-secondary vocational curricula, and more recently it has been adapted to higher education as a way to update and improve academic courses. In this research work, a semantic ontological model is presented to model competencies in the domains of education and industry. It illustrates the use of ontologies for three identified end users: employers, educators, and students. Ontologies are well suited to solving interoperability problems between different domains; they provide a shared understanding of terms across domains and help avoid the wasted effort of translating terminology between them. They also provide opportunities for domain knowledge reuse in different contexts and applications, and can act as a unifying framework between software systems, eliminating the interoperability and communication issues that arise when translating concepts between systems. The scope of this work is to build an ontology representing the domain concepts and to validate it by building competency models. Competencies from the domains of education and industry are defined and classified in the ontology. Then, using a set of logical rules and a semantic reasoner, we can analyze the gap between different education and industry profiles. We will propose different scenarios for how the developed ontology could be used to build competency models. This paper describes how ontologies can serve as a relevant tool for an initial analysis, focusing on the assessment of the competencies needed by the engineering market as a case study. The research questions this work investigates are: 1) Are semantic web ontologies the best solution to model the domain and analyze the gap between industry needs and higher education? 2) Which ontology design approaches are most suitable for representing the competency model? 3) What are the limitations of ontology modeling? Two main limitations are discussed: the Open World Assumption and the limitations of the semantic reasoner. This work is part of the Qatar Foundation NPRP Pro-Skima research project.
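As an illustration only (the paper's actual ontology, rules, and reasoner are not shown here), the sketch below builds a toy competency graph with rdflib and performs the kind of gap check between an industry profile and a course that the abstract describes. All class names, predicates, and competencies are invented for the example.

```python
# Toy competency ontology and gap analysis with rdflib (illustrative names only).
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/competency#")
g = Graph()

# A job profile "requires" competencies; a course "provides" competencies.
g.add((EX.SecurityEngineer, RDF.type, EX.IndustryProfile))
g.add((EX.NetworkSecurityCourse, RDF.type, EX.Course))

for c in ("PenetrationTesting", "Cryptography", "IncidentResponse"):
    g.add((EX.SecurityEngineer, EX.requiresCompetency, EX[c]))
for c in ("Cryptography", "NetworkProtocols"):
    g.add((EX.NetworkSecurityCourse, EX.providesCompetency, EX[c]))

def competency_gap(graph, profile, course):
    """Competencies required by the profile but not provided by the course."""
    required = set(graph.objects(profile, EX.requiresCompetency))
    provided = set(graph.objects(course, EX.providesCompetency))
    return required - provided

gap = competency_gap(g, EX.SecurityEngineer, EX.NetworkSecurityCourse)
print([c.split("#")[-1] for c in gap])
# -> ['PenetrationTesting', 'IncidentResponse'] (order may vary)
```

In the paper, this difference would instead be inferred by a semantic reasoner over OWL axioms; the set difference above only mirrors the intuition.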
Framework of experiential learning to enhance student engineering skill
Authors: Fadi Ghemri, Abdelaziz Bouras and Houssem Gasmi
In this research work, we propose a framework of experiential learning to enhance students' work skills and experience. This research aims to contribute to the development and expansion of local industry through long-term fundamental research that contributes to the science base and to understanding the needs of the national economy, by providing an adapted method to enhance teaching contents and pedagogical organization so that they respond more accurately to the competency requirements of local employers. A vocational approach is a complicated process for universities, since it requires taking into account a multiplicity of variables to establish a compromise between company requirements and the competencies students acquire during university training. Academic experts (teachers, researchers) should design the curriculum to balance theory and practice in order to respond to workplace requirements, bridge the gap, and adequately prepare students for the market. Such complexity requires close and continuous collaboration between industry and academia to build innovative solutions and develop new skills and competencies. Higher educational institutions need to reflect such an evolution in their training curricula, so that trained students are able to tackle real-world challenges efficiently. Collaboration approaches at the undergraduate and graduate levels between industry and academia have shown how such collaborations increase the efficiency and effectiveness of hired graduates and improve their employability. In terms of producing competent graduates, the elaboration of a competence-oriented curriculum and its implementation and organization are crucial. This method is based on cooperative and oriented learning, and it requires an exchange between those responsible for the content and industry, an exchange that must lead to a mutual understanding of needs. To implement this strategy in the Qatari context, we combine various research tools: collecting data from several ethnographic perspectives, local economic data, observations of surrounding universities and local companies, interviews with academic and professional experts, etc. We have also initiated meetings at the university with industrial and academic experts; we recently organized two workshops. During the first one, representatives from companies (Ooredoo, QAPCO, Qatar Airways, IBM, etc.) and academic experts underlined the competency needs in the IT field, especially the cyber-security subfield. The experts attested that it is crucial to have a qualified local workforce to respond to the rapid growth of the digital economy and to reduce dependency on expatriate experts; they also highlighted the importance of collaboration between university and industry through the integration of internships into the master's course curriculum. The second workshop focused on alumni; we considered their opinion highly important because they are in the right position to give feedback on the adequacy of their academic training for their current occupations and to identify challenges and gaps they may have faced when they joined their workplaces.
During that session, the list of industry requirements produced by the second workshop was further discussed and refined with the alumni and our industry partners. All of these actions aim to involve industry as a stakeholder, engage them in our perspective, and build a new cooperative learning curriculum. The establishment of an ecosystem in which different stakeholders from Qatari industry and academia contribute to and discuss the pedagogical evolution of local education, particularly IT and computing education, is important for evolving the curricula to adequately prepare future graduates for the challenges of the real world. Elaborating the curriculum together with professionals from the business side helps guarantee that the content of the curriculum stays up to date, and a modular structure with company involvement and permanent interaction with business facilitates this, together with up-to-date teaching methods. The results of the workshops and the discussion are explained in more detail in the proposed poster. This work is part of the Qatar Foundation NPRP Pro-Skima research project.
UAV-based Flying IoT Gateway for Delay-Tolerant Data Transfer
Authors: Hamid Menouar, Ahmed Abuzrara and Nour Alsahan
Many statistics show that the number of connected devices (Internet of Things, IoT) worldwide will grow drastically in the next few years, exceeding 30 billion by 2020. All these devices need to be connected to the internet to establish two-way communication with backend applications. The list of applications and services that IoT can enable is almost endless, covering areas such as smart cities, smart homes, connected vehicles, intelligent transport systems, agriculture, air and weather monitoring, Industry 4.0, etc. One of the fundamental requirements of an IoT device is to be connected and reachable at any time. However, many of the applications that run on top of IoT do not require a continuous connection to the internet. For example, air-monitoring IoT devices normally sense and report data only once every 15 to 60 minutes; such devices do not need a continuous internet connection, but only one every 15 to 60 minutes. For such devices and use-cases, we propose a flying IoT gateway that can come to the sensor every e.g. 15 minutes, take the data the sensor has collected, and carry it to the backend. In this contribution we present a prototype of a solution that uses unmanned aerial vehicles (UAVs), aka drones, to provide delay-tolerant data routing for IoT devices. In this solution, a drone flies over a set of deployed IoT devices to retrieve the collected and stored data and then delivers it to the backend. Such a solution is suitable for sensing devices that do not require real-time communication, like traffic speed cameras; the speed cameras can collect data and store it locally until a drone comes to carry and transfer it to the backend. This solution not only reduces the overall cost by eliminating the cost of internet connectivity at each IoT device, but it also reduces the security vulnerability, as the devices are not physically connected to the internet all the time, nor directly. This work has been conducted under the R&D project NPRP9-257-1-056, which is funded and supported by QNRF.
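To illustrate the store-and-forward pattern described above, here is a tiny simulation sketch: a sensor buffers readings offline, a drone picks them up on a visit, and later uploads them to the backend. The class names and data layout are invented for the example and are unrelated to the project's prototype.

```python
# Illustrative store-and-forward simulation of a flying IoT gateway (not the project's code).
from dataclasses import dataclass, field
from typing import List

@dataclass
class Sensor:
    sensor_id: str
    buffer: List[dict] = field(default_factory=list)   # readings stored locally, offline

    def sample(self, value: float, t: int) -> None:
        self.buffer.append({"sensor": self.sensor_id, "t": t, "value": value})

    def handover(self) -> List[dict]:
        """Give buffered data to a visiting drone and clear local storage."""
        data, self.buffer = self.buffer, []
        return data

@dataclass
class Drone:
    payload: List[dict] = field(default_factory=list)

    def visit(self, sensor: Sensor) -> None:
        self.payload.extend(sensor.handover())

    def deliver(self, backend: List[dict]) -> None:
        backend.extend(self.payload)
        self.payload = []

backend: List[dict] = []
sensor = Sensor("air-01")
for t in range(0, 60, 15):           # one reading every 15 minutes
    sensor.sample(value=42.0 + t, t=t)

drone = Drone()
drone.visit(sensor)                  # drone flies over the sensor and picks up its buffer
drone.deliver(backend)               # later, the drone reaches connectivity and uploads
print(len(backend), "readings delivered")   # -> 4 readings delivered
```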
Measurement and Analysis of Bitcoin Transactions of Ransomware
Authors: Husam Basil Al Jawaheri, Mashael Al Sabah and Yazan Boshmaf
Recently, more than 100,000 cases of ransomware attacks were reported in the Middle East, Turkey and Africa region [2]. Ransomware is a malware category that limits users' access to their files by encrypting them and requires victims to pay in order to obtain the decryption keys. To remain anonymous, ransomware operators require victims to pay through the Bitcoin network. However, due to an inherent weakness in Bitcoin's anonymity model, it is possible to link identities hidden behind Bitcoin addresses by analyzing the blockchain, Bitcoin's public ledger where the entire history of transactions is stored. In this work, we investigate the feasibility of linking users, as identities represented by Bitcoin public addresses, to addresses owned by entities operating ransomware. To demonstrate how such linking is possible, we crawled BitcoinTalk, a well-known forum for Bitcoin-related discussions, and a subset of public Twitter datasets. Out of nearly 5B tweets and 1M forum pages, we found 4.2K and 41K unique online identities, respectively, along with their public personal information and Bitcoin addresses. We then expanded these datasets using closure analysis, where a Bitcoin address is used to identify a set of other addresses that are highly likely to be controlled by the same user. This allowed us to collect thousands more Bitcoin addresses for the users. By analyzing transactions in the blockchain, we were able to link 6 unique identities to different ransomware operators, including CryptoWall [1] and WannaCry [3]. Moreover, to gain insight into the economy and activity of these ransomware addresses, we analyzed their money flow along with the timestamps of the transactions involving them. We observed that ransomware addresses were active from 2014 to 2017, with an average lifetime of nearly 62 days. While some addresses were only active during a certain year, others operated for more than 3 years. We also observed that the revenue of this malware exceeds USD 6M for CryptoWall and ranges from USD 3.8K to USD 700K for ransomware such as WannaCry and CryptoLocker, with an average of nearly 52 transactions. One address associated with the CryptoLocker ransomware also held a large amount of Bitcoin, worth more than USD 34M at the time of writing. Finally, we believe this type of analysis can potentially be used as a forensic tool to investigate ransomware attacks and possibly help authorities trace the roots of such malware.
References:
1. "Ransom Cryptowall." Symantec. June 14, 2014. Accessed November 01, 2017. https://www.symantec.com/security_response/writeup.jsp?docid=2014-061923-2824-99.
2. Varghese, Joseph. "Ransomware could be deadly, cyber security expert warns." Gulf Times. May 05, 2017. Accessed November 01, 2017. http://www.gulf times.com/story/546937/Ransomware-could-be-deadly-cyber-security-expert-w.
3. Woollaston, Victoria. "WannaCry ransomware: what is it and how to protect yourself." WIRED. June 28, 2017. Accessed November 01, 2017. http://www.wired.co.uk/article/wannacry-ransomware-virus-patch.
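The "closure analysis" mentioned above is commonly realized with the multi-input clustering heuristic: addresses that appear together as inputs of one transaction are assumed to be controlled by the same owner. The sketch below shows that heuristic with a simple union-find structure; the transactions and addresses are fabricated examples, and the paper's exact clustering rules may differ.

```python
# Sketch of address "closure analysis" via the multi-input heuristic (illustrative data).
# Addresses co-spent as inputs of one transaction are assumed to share an owner.
parent = {}

def find(a):
    parent.setdefault(a, a)
    while parent[a] != a:
        parent[a] = parent[parent[a]]   # path compression
        a = parent[a]
    return a

def union(a, b):
    parent[find(a)] = find(b)

# Each transaction is the list of its input addresses (hypothetical examples).
transactions = [
    ["1AliceA", "1AliceB"],      # co-spent -> same user
    ["1AliceB", "1AliceC"],
    ["1MalloryX", "1MalloryY"],
]
for inputs in transactions:
    for addr in inputs[1:]:
        union(inputs[0], addr)

def cluster_of(address):
    root = find(address)
    return {a for a in parent if find(a) == root}

print(cluster_of("1AliceA"))    # -> {'1AliceA', '1AliceB', '1AliceC'}
```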
Artificial Intelligence and Social Media to Aid Disaster Response and Management
Authors: Muhammad Imran, Firoj Alam, Ferda Ofli and Michael Aupetit
Extended Abstract
People increasingly use social media such as Facebook and Twitter during disasters and emergencies. Research studies have demonstrated the usefulness of social media information for a number of humanitarian relief operations, ranging from situational awareness to actionable information extraction. Moreover, the use of social media platforms during sudden-onset disasters can help bridge the information-scarcity issue, especially in the early hours when few other information sources are available. In this work, we analyzed Twitter content (textual messages and images) posted during the recent devastating hurricanes Harvey and Maria. We employed state-of-the-art artificial intelligence techniques to process millions of textual messages and images shared on Twitter, to understand the types of information available on social media and how emergency response organizations can leverage this information to aid their relief operations. Furthermore, we employed deep neural network techniques to analyze the imagery content and assess the severity of the damage shown in the images. Damage severity assessment is one of the core tasks for many humanitarian organizations.
To perform data collection and analysis, we employed our Artificial Intelligence for Digital Response (AIDR) technology. AIDR combines human computation and machine learning to train machine learning models specialized to fulfill specific information needs of humanitarian organizations. Humanitarian organizations such as UN OCHA and UNICEF have used AIDR during major disasters in the past, including the 2015 Nepal earthquake and the 2014 typhoons Hagupit and Ruby, among others. Next, we provide a brief overview of our analysis of the two aforementioned hurricanes.
Hurricane Harvey Case Study
Hurricane Harvey was an extremely devastating storm that made landfall at Port Aransas and Port O'Connor, Texas, in the United States on August 24-25, 2017. We collected and analyzed around 4 million Twitter messages to determine how many of these messages report, for example, some kind of infrastructure damage, injured or dead people, missing or found people, displacement and evacuation, or donations and volunteering. Furthermore, we analyzed geotagged tweets to determine the types of information originating from the disaster-hit areas compared to neighboring areas; for instance, we generated maps of different US cities in and around the hurricane-hit areas. Figure 1 shows the map of geotagged tweets reporting different types of useful information from Florida, USA. According to the results obtained from the AIDR classifiers, both the caution and advice and the sympathy and support categories are more prominent than other informational categories such as donation and volunteering. In addition to processing the textual content of the collected tweets, we performed automatic image processing to collect and analyze imagery content posted on Twitter during Hurricane Harvey, employing state-of-the-art deep learning techniques. One of the classifiers deployed in this case performed damage-level assessment, which aims to predict the level of damage as one of three classes: SEVERE damage, MILD damage, and NO damage.
Our analysis revealed that most of the images (∼86%) either show no damage signs or are considered irrelevant, containing advertisements, cartoons, banners, and other unrelated content. Of the remaining set, 10% of the images show MILD damage, and only ∼4% show SEVERE damage. However, finding these 10% (MILD) or 4% (SEVERE) useful images is like finding a needle in a giant haystack. Artificial intelligence techniques such as those employed by the AIDR platform are hugely useful for overcoming such information-overload issues and help decision-makers process large amounts of data in a timely manner.
Fig. 1: Geotagged tweets from Florida, USA.
Hurricane Maria Case Study
An even more devastating hurricane than Harvey was Hurricane Maria, which hit Puerto Rico and nearby areas. Damaged roofs, uprooted trees, and widespread flooding were among the scenes on the path of Hurricane Maria, a Category 5 hurricane that slammed Dominica and Puerto Rico and caused at least 78 deaths, including 30 in Dominica and 34 in Puerto Rico, and left many more without homes, electricity, food, and drinking water.
We activated AIDR on September 20, 2017 to collect tweets related to Hurricane Maria; more than 2 million tweets were collected. Figure 2 shows the distribution of daily tweet counts. To understand what these tweets are about, we applied our tweet text classifier, which was originally trained (F1 = 0.64) on more than 30k human-labeled tweets from a number of past disasters. AIDR's image processing pipeline was also activated to identify images showing infrastructure damage due to Hurricane Maria. Around 80k tweets contained images; however, ∼75% of these images were duplicates. The remaining 25% (∼20k) images were automatically classified by AIDR's damage assessment classifier into the same three classes as before.
Figure 2: Tweet counts per day.
We believe that more information about the devastation caused by a disaster can be extracted from images than by relying solely on the textual content provided by users. Even though it is still in the testing phase, our image processing pipeline does a decent job of identifying images that show MILD or SEVERE damage. Instead of trying to look at all the images, humanitarian organizations and emergency responders can simply look at the retained set of MILD or SEVERE damage images to get a quick sense of the level of destruction caused by the disaster.
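As a rough illustration of the post-classification step described above (removing duplicate images and tallying predictions into NO/MILD/SEVERE buckets), here is a small Python sketch. The hash-based duplicate check and the example records are assumptions for the example, not AIDR internals.

```python
# Sketch: drop duplicate images, then tally damage-level predictions (illustrative only).
import hashlib
from collections import Counter

def image_fingerprint(image_bytes: bytes) -> str:
    # Exact-duplicate detection via a content hash; a perceptual hash would catch near-duplicates.
    return hashlib.sha256(image_bytes).hexdigest()

def summarize(predictions):
    """predictions: iterable of (image_bytes, label) with label in {'NO', 'MILD', 'SEVERE'}."""
    seen, counts = set(), Counter()
    for image_bytes, label in predictions:
        fp = image_fingerprint(image_bytes)
        if fp in seen:            # skip retweeted / reposted copies
            continue
        seen.add(fp)
        counts[label] += 1
    return counts

sample = [(b"img-a", "NO"), (b"img-a", "NO"), (b"img-b", "MILD"), (b"img-c", "SEVERE")]
print(summarize(sample))   # -> Counter({'NO': 1, 'MILD': 1, 'SEVERE': 1})
```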
Fault Tolerant Control of Multiple Mobile Robots
Authors: Nader Meskin and Parisa Yazdjerdi
Recently, the use of autonomous wheeled mobile robots (WMRs) has increased significantly in industries such as manufacturing, health care, and the military, and there are stringent requirements for their safe and reliable operation in industrial and commercial environments. In addition, autonomous multi-agent mobile robot systems, in which a number of robots cooperate to accomplish a task, are increasingly in demand across industries. Consequently, the development of fault tolerant controllers (FTC) for WMRs is a vital research problem to address in order to enhance the safety and reliability of mobile robots. The main aim of this paper is to develop an actuator fault tolerant controller for both single- and multiple-robot applications, with the main focus on differential drive mobile robots. Initially, a fault tolerant controller is developed for loss-of-effectiveness actuator faults in differential drive mobile robots while tracking a desired trajectory. The heading and position of the differential drive mobile robot are controlled through the angular velocities of the left and right wheels. The actuator loss-of-effectiveness fault is modeled in the kinematic equation of the robot as a multiplicative gain on the left and right wheel angular velocities. Accordingly, the aim is to estimate these gains using a joint parameter and state estimation framework. Toward this goal, the augmented discrete-time nonlinear model of the robot is considered. Based on the extended Kalman filter technique, a joint parameter and state estimation method is used to estimate the actuator loss-of-effectiveness gains as parameters of the system, as well as the system states. The estimated gains are then used in the controller to compensate for the effect of actuator faults on the performance of the mobile robot. In addition, the proposed FTC method is extended to leader-follower formation control of mobile robots in the presence of a fault in either the leader or the followers. The multi-agent mobile robot system is designed to track a trajectory while keeping a desired formation in the presence of actuator loss-of-effectiveness faults. It is assumed that the leader controller is independent of the followers and is designed based on the FTC framework developed earlier in this work. The fault is again modeled in the kinematic equation of the robot as a multiplicative gain, and the augmented discrete-time nonlinear model is used to estimate the loss-of-effectiveness gains. The follower controller is designed based on a feedback linearization approach with respect to the coordinates of the leader robot. An extended Kalman filter is used for each robot to estimate the parameters and states of the system, and when a fault is detected in any of the followers, the corresponding controller compensates for it. Finally, the efficacy of the proposed FTC framework for both single and multiple mobile robots is demonstrated by experimental results using Qbot-2 robots from Quanser. To sum up, a fault tolerant control scheme is proposed for differential drive mobile robots in the presence of loss-of-effectiveness actuator faults. A joint parameter and state estimation scheme based on the EKF is used to estimate the parameters (actuator loss of effectiveness) and the system states, and the effect of the estimated fault is compensated in the controller for both single-robot and formation control of multiple mobile robots.
The proposed schemes are experimentally validated on Qbot-2 robots.
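To make the joint state-and-parameter estimation idea concrete, here is a minimal sketch (not the authors' implementation): a discrete-time differential-drive kinematic model whose wheel speeds are scaled by unknown loss-of-effectiveness gains, with the state augmented by those gains and updated by one EKF prediction/correction step. The wheel radius, track width, noise covariances, and position-only measurement model are illustrative assumptions.

```python
# Minimal EKF sketch for joint state + actuator-fault-gain estimation of a
# differential-drive robot (illustrative parameters; not the paper's implementation).
import numpy as np

r, L, dt = 0.035, 0.23, 0.05      # assumed wheel radius [m], track width [m], step [s]

def f(z, u):
    """Kinematics with multiplicative loss-of-effectiveness gains a_l, a_r on wheel speeds."""
    x, y, th, a_l, a_r = z
    w_l, w_r = u
    v  = r * (a_r * w_r + a_l * w_l) / 2.0
    om = r * (a_r * w_r - a_l * w_l) / L
    return np.array([x + v * np.cos(th) * dt,
                     y + v * np.sin(th) * dt,
                     th + om * dt,
                     a_l, a_r])                      # gains modeled as constant parameters

def F_jac(z, u):
    x, y, th, a_l, a_r = z
    w_l, w_r = u
    v = r * (a_r * w_r + a_l * w_l) / 2.0
    F = np.eye(5)
    F[0, 2] = -v * np.sin(th) * dt
    F[1, 2] =  v * np.cos(th) * dt
    F[0, 3] = r * w_l / 2.0 * np.cos(th) * dt
    F[0, 4] = r * w_r / 2.0 * np.cos(th) * dt
    F[1, 3] = r * w_l / 2.0 * np.sin(th) * dt
    F[1, 4] = r * w_r / 2.0 * np.sin(th) * dt
    F[2, 3] = -r * w_l / L * dt
    F[2, 4] =  r * w_r / L * dt
    return F

H = np.array([[1., 0., 0., 0., 0.],                # assume only (x, y) is measured
              [0., 1., 0., 0., 0.]])
Q = np.diag([1e-5, 1e-5, 1e-5, 1e-6, 1e-6])        # process noise (tuning assumption)
R = np.diag([1e-4, 1e-4])                           # measurement noise (tuning assumption)

def ekf_step(z, P, u, meas):
    # Predict with the nonlinear kinematics and its Jacobian.
    F = F_jac(z, u)
    z = f(z, u)
    P = F @ P @ F.T + Q
    # Correct with the position measurement.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    z = z + K @ (meas - H @ z)
    P = (np.eye(5) - K @ H) @ P
    return z, P

# Start with nominal (fault-free) gains of 1; as measurements arrive, z[3] and z[4]
# drift toward the true fault level, and the controller can rescale its wheel
# commands by 1/a_l and 1/a_r to compensate for the loss of effectiveness.
z_est, P_est = np.array([0., 0., 0., 1., 1.]), np.eye(5) * 0.1
```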
Wearable V2X Solution for Children and Vulnerable Road Users' Safety on the Road
Authors: Hamid Menouar, Nour Alsahan and Mouhamed Ben Brahim
According to Energy Technology Perspectives 2014, there are approximately 900 million light-duty vehicles (not counting two- and three-wheelers) today, and that number will double by 2050 to reach 2 billion. Such a considerable increase will bring further challenges for road safety and traffic efficiency. Motor vehicle crashes are the leading cause of death for children and young adults in the United States, with an annual death toll of 33,000 and over 2.3 million people injured. Those figures are representative of the challenge not only in the US but also in other regions, including Qatar and the Gulf region. Vehicle-to-Vehicle and Vehicle-to-Infrastructure (V2X) communication technology, which will reach our roads in a few years, is seen as a powerful way to reduce the number of accidents on the road, and it is considered an enabler of the next generation of road safety and Intelligent Transport Systems (ITS). V2X communication is not limited to vehicle-to-vehicle and vehicle-to-infrastructure; it is also meant to be used for vehicle-to-bicycle and even vehicle-to-pedestrian communication. Indeed, by enabling real-time, fast communication between vehicles and vulnerable road users such as cyclists and pedestrians, we can make the road much safer for those vulnerable road users. This is one of the use cases we would like to enable and test in the Qatar V2X Field Operational Test (Qatar V2X FOT), which is supported and funded by the Qatar National Research Fund (QNRF). Equipping vulnerable road users such as cyclists and pedestrians with V2X capabilities has many challenges. The main one is energy consumption: V2X operates in the 5.9 GHz radio band, which consumes relatively high energy. Therefore, to operate V2X, a vulnerable user, especially a pedestrian, needs to carry a battery that must be regularly recharged. We came up with a solution to this problem that reduces the energy consumption of the V2X device to a level at which it can operate on a small battery suitable for a wearable device. This poster will expose the challenges of using V2X for vulnerable road users, especially pedestrians, and will present the solution and the related prototype that was developed and tested within the NPRP project. The solution and prototype presented in this poster are the outcomes of the research project NPRP 8-2459-1-482, which is supported and funded by QNRF.
Evaluation of Hardware Accelerators for Lattice Boltzmann-based Aneurysm Blood Flow Measurement
Clipping is a potential treatment for patients with ruptured or unruptured brain aneurysms. In order to determine the most suitable treatment and the clip's location, surgeons need measurements such as velocity and blood pressure in and around the aneurysm. Typically, simulating the blood flow and obtaining the corresponding measurements require heavy computational resources. The Lattice Boltzmann (LB) method is a conventional way to simulate fluid dynamics, and HemeLB is an open-source computational suite for 3D fluid dynamics simulations of blood flow in the vasculature of the human body. In this work, we aim to evaluate the hardware acceleration of LB and HemeLB on a reconfigurable system on chip (SoC) and a high performance computing (HPC) machine using the RAAD platform.
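For readers unfamiliar with the Lattice Boltzmann method mentioned above, the following is a minimal, generic D2Q9 BGK collide-and-stream step in NumPy, included only to show the kind of kernel that such hardware accelerators target. It is not HemeLB code, and the grid size and relaxation time are arbitrary.

```python
# Generic D2Q9 BGK Lattice Boltzmann step (illustrative; unrelated to HemeLB internals).
import numpy as np

nx, ny, tau = 64, 64, 0.6                    # grid size and relaxation time (arbitrary)
w = np.array([4/9] + [1/9]*4 + [1/36]*4)     # lattice weights
c = np.array([[0,0],[1,0],[0,1],[-1,0],[0,-1],[1,1],[-1,1],[-1,-1],[1,-1]])

f = np.ones((9, nx, ny)) * w[:, None, None]  # start from rest (uniform density 1)

def equilibrium(rho, ux, uy):
    cu = c[:, 0, None, None] * ux + c[:, 1, None, None] * uy
    usq = ux**2 + uy**2
    return w[:, None, None] * rho * (1 + 3*cu + 4.5*cu**2 - 1.5*usq)

def step(f):
    rho = f.sum(axis=0)                                   # macroscopic density
    ux = (f * c[:, 0, None, None]).sum(axis=0) / rho      # macroscopic velocity
    uy = (f * c[:, 1, None, None]).sum(axis=0) / rho
    f = f - (f - equilibrium(rho, ux, uy)) / tau          # BGK collision
    for i in range(9):                                    # streaming (periodic boundaries)
        f[i] = np.roll(np.roll(f[i], c[i, 0], axis=0), c[i, 1], axis=1)
    return f

for _ in range(100):
    f = step(f)
```

The per-cell, nearest-neighbor structure of this loop is what makes the method attractive for FPGA and HPC acceleration.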
Effective Realtime Tweet Summarization
Authors: Reem Suwaileh and Tamer Elsayed
Twitter has developed into an immense information creation and sharing network through which users post information. This information can vary from the world's breaking news to other topics such as sports, science, religion, and even personal daily updates. Although a user may regularly check her Twitter timeline to stay up-to-date on her topics of interest, it is impossible to manually track those topics while tackling the challenges that emerge from the nature of the Twitter timeline. Among these challenges are the large volume of posted tweets (about 500M tweets are posted daily), noise (e.g., spam), redundant information (e.g., tweets with similar content), and the rapid development of topics over time. This necessitates the development of real-time summarization (RTS) systems that automatically track a set of predefined interest profiles (representing the users' topics of interest) and summarize the stream while considering the relevance, novelty, and freshness of the selected tweets. For instance, if a user is interested in following the updates on the "GCC crisis", the system should efficiently monitor the stream and capture the on-topic tweets covering all aspects of the topic (e.g., official statements, interviews, and new claims against Qatar), which change over time. Accordingly, real-time summarization should use simple and efficient approaches that can scale to follow multiple interest profiles simultaneously. In this work, we tackle this problem by proposing an RTS system that adopts a lightweight and conservative filtering strategy. Given a set of user interest profiles, the system tracks those profiles over Twitter's continuous live stream in a scalable manner, in a pipeline of multiple phases: pre-qualification, preprocessing, indexing, relevance filtering, novelty filtering, and tweet nomination. In the pre-qualification phase, the system filters out non-English and low-quality tweets (i.e., tweets that are too short or include too many hashtags). Once a tweet is qualified, the system preprocesses it in a series of steps (e.g., removing special characters) that prepare the tweet for the relevance and novelty filters. The system adopts a vector space model in which both interest profiles and incoming tweets are represented as vectors constructed using idf-based term weighting. An incoming tweet is scored for relevance against the interest profiles using the standard cosine similarity. If the relevance score of a tweet exceeds a predefined threshold, the system adds the tweet to the potentially-relevant tweets for the corresponding profile. The system then measures the novelty of the potentially-relevant tweet by computing its lexical overlap with the already-pushed tweets using a modified version of Jaccard similarity; a tweet is considered novel if the overlap does not exceed a predefined threshold. This way the system does not overwhelm the user with redundant notifications. Finally, the list of potentially-relevant and novel tweets of each profile is re-ranked periodically based on both relevance and freshness, and the top tweet is pushed to the user; this ensures the user is not overwhelmed with excessive notifications while still getting fresh updates. The system also allows the expansion of the profiles over time (by automatically adding potentially-relevant terms) and dynamic adjustment of the thresholds to adapt to changes in the topics over time.
We conducted extensive experiments over multiple standard test collections that were specifically developed to evaluate RTS systems. Our live experiments, in which we tracked more than 50 topics over a large stream of tweets for 10 days, show both the effectiveness and the scalability of our system. Indeed, our system exhibited the best performance among 19 international research teams from all over the world in a research track organized by the NIST institute (in the United States) last year.
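The relevance and novelty filters described above can be sketched in a few lines. In the sketch below the idf table, the thresholds, and the example tweets are invented, and plain Jaccard similarity stands in for the paper's modified Jaccard; it illustrates the pipeline's decision logic rather than the system's implementation.

```python
# Sketch of the relevance + novelty filtering steps (thresholds and idf values invented;
# plain Jaccard stands in for the paper's modified Jaccard).
import math
from collections import Counter

IDF = {"qatar": 1.2, "gcc": 2.5, "crisis": 2.1, "statement": 1.8, "football": 1.0}
REL_THRESHOLD, NOVELTY_THRESHOLD = 0.3, 0.6

def vectorize(terms):
    tf = Counter(terms)
    return {t: tf[t] * IDF.get(t, 1.0) for t in tf}      # idf-based term weighting

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v*v for v in a.values())) * math.sqrt(sum(v*v for v in b.values()))
    return dot / norm if norm else 0.0

def jaccard(a_terms, b_terms):
    a, b = set(a_terms), set(b_terms)
    return len(a & b) / len(a | b) if a | b else 0.0

def consider(tweet_terms, profile_terms, pushed):
    """True if the tweet is relevant to the profile and novel w.r.t. already-pushed tweets."""
    if cosine(vectorize(tweet_terms), vectorize(profile_terms)) < REL_THRESHOLD:
        return False                                      # relevance filter
    if any(jaccard(tweet_terms, p) > NOVELTY_THRESHOLD for p in pushed):
        return False                                      # novelty filter
    return True

pushed = [["gcc", "crisis", "statement"]]
print(consider(["qatar", "gcc", "crisis"], ["gcc", "crisis", "qatar"], pushed))   # -> True
```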
A Reconfigurable Multipurpose SoC Mobile Platform for Metal Detection
Authors: Omran Al Rshid Abazeed, Naram Mhaisen, Youssef Al-Hariri, Naveed Nawaz and Abbes Amira
Background and Objectives
One of the key problems in mobile robotics is the ability to understand and analyze the surrounding environment in a useful way. This is especially important in dangerous applications where human involvement should be avoided. A clear example of employing robots in dangerous applications is mine detection, which is mostly done through metal detection techniques. Among the various types of walking robots, hexapod walking robots offer a good static stability margin and faster movement, especially in rough-terrain applications [1]. Thus, the Hexapod Terasic Spider Robot is a suitable platform for metal detection, especially as it is equipped with an Altera DE0-Nano field programmable gate array (FPGA) SoC, which allows for very high performance and accuracy. This work introduces a novel implementation of a metal detection module on the Terasic Spider Robot; the metal detection module is designed and interfaced with the robot in order to perform the metal detection. The user can control the robot and receive feedback through a Bluetooth-enabled Android phone. In addition, a general-purpose design flow that can be used to implement other applications on this platform is proposed, which demonstrates the versatility of the platform.
Method
The designed metal detection module (MDM) is mainly based on an oscillator and a coil. Its operating principle is that when the coil approaches a metal object, the frequency of the oscillator changes [2]. This frequency change can be accurately monitored in real time using the FPGA SoC board, so the module can be used for detecting metals. The metal detection module is interfaced with the DE0-Nano SoC board, where the detection algorithm is implemented. The development of the algorithm is carried out on the board available on this robot. The board includes an FPGA, which provides a high-performance, real-time implementation of parts of the algorithm, and a hard processor system (HPS) running Linux OS, which can be used to easily interface the board with other computer systems and peripherals such as mobile phones and cameras [3]. As shown in Fig. 1, the detection algorithm is based on hardware/software co-design; the output of the MDM is provided to the FPGA part of the board in order to achieve accurate, real-time monitoring. Upon detection, the FPGA sends a detection signal through the shared-memory interface to the HPS part of the board. The HPS is then responsible for sending a warning to the mobile phone through a multi-threaded communication application running on the HPS.
Figure 1: General architecture of the metal detection system.
In order to implement the metal detection algorithm on the Terasic Spider Robot, it was necessary to formulate and follow the design flow provided in Fig. 2. This design flow can be used to implement other applications that can utilize the hardware/software co-design approach for better performance.
Figure 2: General-purpose design flow for the Altera Terasic Spider Robot platform.
Results and Discussion
Due to the coil specification and the circuit design, the frequency captured in normal conditions (no metal present) is 2155 ± 20 Hz. The frequency is inversely proportional to the distance of the metal from the coil; in other words, the frequency increases as the distance between the metal and the coil decreases.
When a metal object at least as large as the coil is present at a distance of 7 cm from the detection coil, the frequency exceeds 2200 Hz regardless of the medium. The tested medium was wood; however, similar results were obtained with air. These numbers are specific to the proposed system, and changing the circuit parameters can increase the detection distance if desired; for example, more coil turns, a bigger coil diameter, and faster oscillation will all increase the detection distance. To avoid any interference between the robot body and the metal detection circuit readings, a 15-inch plastic arm is used to connect the metal detection module to the body of the robot. The electronic components are attached to this arm as close as possible to the coil. The metal detection module attached to the plastic arm and the complete spider robot are shown in Figs. 3 and 4, respectively.
Figure 3: The metal detection circuit combined with the arm.
Figure 4: MDM connected to the Terasic Spider Robot.
The robot is then controlled through a mobile application, which was modified so that the robot can send feedback (a detection warning) to the mobile phone. Figure 5 shows an example of the notification message "Metal Detected" displayed whenever a metal object is detected.
Figure 5: Metal detection message in the mobile application interface.
Summary and Conclusion
This abstract gives a general description of a research project that aims to utilize the Terasic Spider Robot platform to perform accurate, real-time metal detection. This is an important application that helps humans avoid involvement in dangerous operations like mine detection. In addition, a general-purpose design flow is proposed for the benefit of the research community and anyone who intends to implement an application on this platform in the future.
Acknowledgment
This project was funded by the Qatar University Internal Grants program.
References
[1] Y. Zhu, B. Jin, Y. Wu, T. Guo and X. Zhao, "Trajectory Correction and Locomotion Analysis of a Hexapod Walking Robot with Semi-Round Rigid Feet", Sensors, vol. 16, no. 9, p. 1392, 2016.
[2] T. Alauddin, M. T. Islam and H. U. Zaman, "Efficient design of a metal detector equipped remote-controlled robotic vehicle," 2016 International Conference on Microelectronics, Computing and Communications (MicroCom), Durgapur, 2016, pp. 1-5.
[3] "Cyclone V Device Overview", Altera, 2016. [Online]. Available: https://www.altera.com/en_US/pdfs/literature/hb/cyclone-v/cv_51001.pdf. [Accessed: 16-Oct-2017]
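A tiny software sketch of the detection logic reported above (baseline ≈ 2155 ± 20 Hz, alarm above 2200 Hz). In the real system this comparison runs on the FPGA in hardware; the frequency-measurement function here is only a stand-in.

```python
# Illustrative detection logic: flag metal when the oscillator frequency leaves the
# no-metal band (2155 ± 20 Hz) and crosses the 2200 Hz threshold reported in the abstract.
BASELINE_HZ = 2155.0
DETECTION_THRESHOLD_HZ = 2200.0

def measure_frequency() -> float:
    """Stand-in for the FPGA's real-time frequency measurement of the MDM oscillator."""
    return 2230.0   # e.g., a metal object within ~7 cm of the coil

def metal_detected(freq_hz: float) -> bool:
    return freq_hz >= DETECTION_THRESHOLD_HZ

freq = measure_frequency()
if metal_detected(freq):
    # On the real platform, the FPGA raises a flag in shared memory and the HPS
    # forwards a "Metal Detected" notification to the Android app over Bluetooth.
    print(f"Metal detected (oscillator at {freq:.0f} Hz, baseline {BASELINE_HZ:.0f} Hz)")
```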
Multi-Objective Search-Based Requirements Traceability Recovery for Adaptive Systems
Authors: Mohamed Salah Hamdi, Adnane Ghannem and Hany Ammar
Complex adaptive systems exhibit emergent behavior. This type of behavior occurs in volatile environments involving cyber-physical systems, such as those aimed at smart-city operations. Maintenance of adaptive systems aims at improving their performance by dealing with continuous and frequently changing requirements, and their behavior therefore requires up-to-date requirements traceability. To this end, we need to understand the requirements and to localize the program parts that should be modified according to the description of the new requirements. This process is known as Requirements Traceability Recovery (RTR). Generating requirements traceability by hand (e.g., by a system maintainer) is time consuming and error-prone, and most approaches in the literature are time consuming and semi-automatic, always requiring user intervention. In our work, we are specifically interested in following the links between requirements and the code of the software system, with the aim of helping the designer update the system appropriately by automating the traceability process. To do this, we formulated the RTR problem as a complex combinatorial problem that can be tackled using heuristic search (HS) techniques. These techniques can intelligently explore a large search space (the space of possible solutions) and find an acceptable approximate solution. A variety of HS techniques exist in the literature. In our work, we use the Non-dominated Sorting Genetic Algorithm (NSGA-II), an improved version of the classic Genetic Algorithm (GA). NSGA-II is a multi-objective technique that aims at finding the best compromise between objectives (the Pareto front). Applying NSGA-II to a specific problem (here, requirements traceability recovery) requires defining the following five elements: 1. the representation of individuals (vector, tree, etc.); 2. the evaluation (fitness) function; 3. the selection of the (best) individuals to transmit from one generation to the next; 4. the creation of new individuals using genetic operators (crossover and mutation) to explore the search space; 5. the generation of a new population from the selected individuals and the newly created ones. The proposed approach takes as input a software system, a set of requirements, and the maintenance history of the software system, and produces as output the trace links, i.e., the artifacts (classes, methods, etc.) related to each requirement. Three objectives are defined to guide NSGA-II: (1) semantic similarity, (2) Recency of Change (RC), and (3) Frequency of Change (FC). We use the cosine of the angle between the vector representing the requirement and the vector representing the software element to measure semantic similarity. To calculate the RC measure, we use information extracted from the history of changes accumulated during the maintenance of the software system; the intuition behind introducing RC is that artifacts (classes, methods, etc.) that changed more recently than others are more likely to change now, i.e., are related to the new requirements at hand. The FC measure is calculated from the same change history.
The intuition behind introducing the FC measure is that artifacts (classes, methods, etc.) that change more frequently than others are more likely to change now, i.e., are related to the new requirement at hand. Each solution consists of assigning each requirement to one or many artifacts (classes, methods) of the system, and a solution should maximize the three objectives as much as possible. Experiments were conducted on three open-source systems in order to evaluate the approach. The obtained results confirm the effectiveness of the approach in correctly generating the traces between the requirements and classes in the source code, with a precision of 91% on average and a recall of 89% on average. We also compared our results to those obtained by two recent works and found that our approach outperforms both, with higher average precision and recall across all three projects.
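To make the three objectives tangible, here is a sketch of how they might be computed for one candidate requirement-to-artifact link, together with the Pareto-dominance test that NSGA-II relies on. The normalizations, the simple bag-of-words cosine, and the example data are assumptions for illustration, not the paper's exact definitions.

```python
# Sketch of the three objectives for one candidate trace link (requirement -> artifact).
# Normalizations and the dominance check are illustrative, not the paper's exact definitions.
import math
from collections import Counter

def cosine_similarity(req_text: str, artifact_text: str) -> float:
    # artifact_text is assumed to be already tokenized (e.g., identifiers split into words)
    a, b = Counter(req_text.lower().split()), Counter(artifact_text.lower().split())
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v*v for v in a.values())) * math.sqrt(sum(v*v for v in b.values()))
    return dot / norm if norm else 0.0

def recency_of_change(days_since_last_change: int, horizon_days: int = 365) -> float:
    """Higher when the artifact changed recently (clamped to [0, 1])."""
    return max(0.0, 1.0 - days_since_last_change / horizon_days)

def frequency_of_change(n_changes: int, max_changes: int) -> float:
    """Higher when the artifact changed often in the maintenance history."""
    return n_changes / max_changes if max_changes else 0.0

def objectives(req_text, artifact_text, days_since_change, n_changes, max_changes):
    return (cosine_similarity(req_text, artifact_text),
            recency_of_change(days_since_change),
            frequency_of_change(n_changes, max_changes))

def dominates(u, v):
    """Pareto dominance used by NSGA-II: u is no worse everywhere and better somewhere."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

link_a = objectives("export report as pdf", "pdf report exporter", 10, 8, 20)
link_b = objectives("export report as pdf", "login controller", 300, 1, 20)
print(dominates(link_a, link_b))   # -> True for this toy example
```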
SLiFi: Exploiting Visible Light Communication (VLC) to Authenticate WiFi Access Points
Authors: Hafsa Amin, Faryal Asadulla, Aisha Jaffar, Gabriele Oligeri and Mashael Al-Sabah
This work presents an effective and efficient solution (SLiFi) to the evil twin attack in wireless networks. An evil twin is a rogue Wi-Fi access point (AP) that pretends to be an authentic one by using the same network configuration, including (i) the Service Set Identifier (SSID), (ii) the communication channel, and (iii) the MAC address of the impersonated AP. The evil twin is a trap set up by an adversary willing to eavesdrop on the user's Internet traffic. The attack is relatively easy to implement, hard to detect, and can have a severe impact on a user's privacy. Many researchers have focused on this attack and provided defenses from different perspectives: network, access point, and client side. Unfortunately, the solutions provided so far are not ready for mass deployment, since they involve significant modifications to the 802.11 Wi-Fi protocol. In the following, we report some of the most important ones. Gonzales et al. [1] proposed constructing a context vector containing the order of all APs detected at a particular time, with their SSID and RSSI values; this enables the client to compare future associations with the stored context vector. Bauer et al. [2] proposed SWAT, a request-response protocol that provides one-way AP authentication and allows the client to establish a connection to the network through a shared secret key, creating a secure session based on the principle of trust-on-first-use (TOFU). Lanze et al. [3] introduced a technique using the aircrack-ng suite: the tool airbase-ng is set up on all devices, beacon frames are collected from various APs, and the approach compares the Timing Synchronization Function (TSF) timestamps and their corresponding receiving times in order to spot anomalies due to message proxying and, therefore, the presence of a malicious AP. Finally, Gangasagare et al. [4] proposed a fingerprinting technique based on network traffic that detects whether the AP relays traffic through another wireless connection. SLiFi does not require any changes to existing communication protocols, and it enables access point authentication (by the users) in a fast and reliable way. Indeed, SLiFi enables the user to authenticate the legitimate AP by exploiting a Visible Light Communication (VLC) channel. SLiFi involves two parties: the (honest) AP, which has a Wi-Fi interface and is able to transmit data through a VLC channel, and an end-user with software that can read data from a VLC channel, e.g., using a webcam. SLiFi consists of four phases. AP public key (PubKey) broadcast: the AP transmits its own PubKey to the end-user via an authenticated channel (VLC). The PubKey broadcast process is completely transparent to the user, since each bit of the PubKey is delivered by quickly switching on and off the light of the room the user is in. This is achieved with standard VLC techniques: the human eye cannot perceive the fast blinking light, but other devices, such as special webcams, can detect the brightness change. The brightness changes can then be translated into a sequence of bit values. Seed generation: the end-user retrieves the public key from the VLC channel using a webcam and transmits back to the AP a randomly generated seed encrypted with the AP's public key.
The PubKey is securely delivered to the user, since any non-authorized light source can be easily spotted; therefore, only one authorized VLC transmitter is in place, and it delivers the PubKey of the AP. The client can now use the trusted PubKey to send back to the AP an encrypted seed to be used for key generation. Secret key generation: the AP receives the user's encrypted seed via the Wi-Fi channel, decrypts the seed using its private key, and sends an acknowledgment message encrypted with the seed back to the end-user. This phase performs the key agreement, and both the AP and the user's device converge to a shared secret key. Encrypted communication: any further communication between the end-user and the AP is encrypted with the shared secret key, i.e., the seed generated by the client. SLiFi supports multiple clients, since the AP can easily handle concurrent communications. Moreover, from a practical perspective, SLiFi can be adopted only to generate the shared secret key and pass it to an existing encryption scheme, e.g., WPA2 or WPA2-Enterprise. To evaluate SLiFi, we built a proof-of-concept using (1) a Raspberry Pi that emulates the AP, (2) a set of LEDs to transmit the PubKey, and (3) standard laptops with webcams acting as clients. All the software components have been implemented and tested. We performed several tests to evaluate the feasibility of our solution. To test the reliability of the VLC transmission, we ran various experiments measuring the public-key transmission errors as a function of the VLC bit-rate, and we observed that the PubKey can be reliably transmitted within a reasonable time frame. Finally, our results prove the feasibility of the solution in terms of the time needed to establish the key and robustness to the evil-twin attack.
References
1. H. Gonzales, K. Bauer, J. Lindqvist, D. McCoy, and D. Sicker. Practical Defenses for Evil Twin Attacks in 802.11. In IEEE Globecom Communications and Information Security Symposium (Globecom 2010), Miami, FL, December 2010.
2. K. Bauer, H. Gonzales, and D. McCoy. Mitigating Evil Twin Attacks in 802.11. January 2009.
3. F. Lanze, A. Panchenko, T. Engel, and I. Alcaide. Undesired Relatives: Protection Mechanisms against the Evil Twin Attack in IEEE 802.11.
4. M. Gangasagare. Active User-Side Evil Twin Access Point Detection. International Journal of Scientific & Engineering Research, May 2014.
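The four phases can be sketched end-to-end in software. In the sketch below, RSA-OAEP and Fernet (an AES-based scheme) stand in for whatever primitives the real SLiFi prototype uses, and the VLC broadcast is represented only as a byte hand-off; it illustrates the protocol flow, not the project's codebase.

```python
# Protocol-flow sketch of SLiFi's four phases (illustrative primitives; VLC is a byte hand-off).
import base64, os
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.fernet import Fernet

OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Phase 1 -- PubKey broadcast: the AP's public key bits would be blinked over VLC.
ap_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
pubkey_over_vlc = ap_private.public_key().public_bytes(
    serialization.Encoding.PEM, serialization.PublicFormat.SubjectPublicKeyInfo)

# Phase 2 -- Seed generation: the client reads the PubKey from the VLC channel (webcam)
# and sends back a random seed encrypted with it over Wi-Fi.
client_pubkey = serialization.load_pem_public_key(pubkey_over_vlc)
seed = os.urandom(32)
encrypted_seed = client_pubkey.encrypt(seed, OAEP)

# Phase 3 -- Secret key generation: the AP decrypts the seed with its private key;
# both sides now derive the same symmetric key from the seed.
seed_at_ap = ap_private.decrypt(encrypted_seed, OAEP)
key_ap = Fernet(base64.urlsafe_b64encode(seed_at_ap))
key_client = Fernet(base64.urlsafe_b64encode(seed))

# Phase 4 -- Encrypted communication: all further traffic uses the shared key.
ack = key_ap.encrypt(b"association acknowledged")
print(key_client.decrypt(ack))         # -> b'association acknowledged'
```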
QEvents: Real-Time Recommendation of Neighboring Events
Authors: Heba Hussein, Sofiane Abbar and Monishaa Arulalan
Technology always seeks to improve the little details of our lives for a faster and more efficient pace of life. One of these little problems we face daily is finding relevant events. For example, you visit a place like Katara with your kids and spend your time in vain looking for a fun activity, and after you leave the venue a friend tells you about an interesting "Henna and face painting workshop organized in building 25". To solve this problem we propose QEvents, a platform that provides users with real-time recommendations about events happening around their location that best match their preferences. QEvents renders the events in a map-centric dashboard to allow easy browsing and user-friendly interaction. QEvents continuously listens to online channels that broadcast information about events taking place in Qatar, including specialized websites (e.g., eventsdoha.com), social media (e.g., Twitter), and news (e.g., dohanews.com). The main challenge QEvents strives to solve is how to extract important features such as title, location, and time from the free text describing the events. We show in this paper how one can leverage existing technologies such as topic modeling, named entity recognition, and advanced text parsing to transform a plain event-listing website into a dynamic, live service capable of recognizing an event's location, title, category, and starting and ending times, and of rendering them in a map-centric visualization that allows more natural exploration.
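As a toy illustration of the feature-extraction problem described above, the sketch below pulls a time and a venue cue out of an event blurb with regular expressions. The real system relies on topic modeling and named entity recognition; the patterns, example text, and naive title rule here are assumptions made only for the example.

```python
# Toy extraction of time and venue cues from an event blurb (regex stand-in for the
# topic-modeling / NER pipeline described in the abstract; patterns are illustrative).
import re

def extract_event_features(text: str) -> dict:
    time_match = re.search(r"\b(\d{1,2}(?::\d{2})?\s?(?:am|pm))\b", text, re.IGNORECASE)
    venue_match = re.search(r"\b(?:at|in)\s+([A-Z][\w\s]+?(?:building\s+\d+|Katara|QNCC))",
                            text, re.IGNORECASE)
    return {
        "title": text.split(",")[0].strip(),          # naive: first clause as title
        "time": time_match.group(1) if time_match else None,
        "venue": venue_match.group(1).strip() if venue_match else None,
    }

blurb = "Henna and face painting workshop, starts 4:30 pm at Katara building 25"
print(extract_event_features(blurb))
# -> {'title': 'Henna and face painting workshop', 'time': '4:30 pm', 'venue': 'Katara building 25'}
```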
Driver Drowsiness Detection Study Using Heart Rate Variability Analysis in a Virtual Reality Environment
Introduction
Mobility and road safety are among the grand challenges that Qatar has faced during the last decade. There are many ways to enhance road safety; one is to characterize the factors contributing to road fatalities. According to the Transport Accident Commission, about 20% of fatal road accidents are caused by driver fatigue [1]. As reported by Monthly Qatar Statistics [2], the total number of deaths for the first 8 months of the current year is 116; thus, around 23 of those casualties can be attributed to driver fatigue. According to the U.S. Department of Transportation's NHTSA, in 2016 the number of fatalities involving a drowsy driver was 803, which is 2.1% of total fatalities in the US in the same year [3]. It is therefore essential to design and implement an embedded system in vehicles that can analyze, detect, and recognize the driver's state. The main aim of this project is to detect and recognize different drowsiness states using electrocardiogram (ECG) based heart rate variability (HRV) analysis, acquiring heartbeat data while the driver is driving the car at different times of the day. An alarm is then produced before the driver's condition reaches the dangerous state that might lead him or her to be involved in an accident.
Background
A driver's drowsiness state can be detected through different methods. One of the most accurate is to use the HRV information acquired from the electrocardiogram (ECG) signal, which helps to identify different states such as awake, dizzy, drowsy, and asleep behind the steering wheel. HRV describes involuntary nervous function and is, in effect, the variation of the R-to-R intervals (RRI) of an acquired ECG signal [4]. By identifying the RRI, i.e., the distance between the R peaks, we can decide whether the driver is in a drowsy state by analyzing HRV time- and frequency-domain features. The low frequency (LF) band (0.04-0.15 Hz) reflects both sympathetic and parasympathetic activity of the heart, whereas the high frequency (HF) band (0.15-0.4 Hz) reflects only parasympathetic activity [4]. The LF/HF ratio reflects the difference between awake and drowsy states, decreasing gradually from the awake state to the drowsy state [5-6].
Method
A portable wireless BioRadio ECG system (Fig. 2A) (Great Lakes NeuroTechnologies, Inc.) was used with three Ag/AgCl electrodes attached to the participant's chest. The points of attachment are (i) two electrodes under the right and left collarbone, and (ii) one electrode under the lowest left rib of the participant. The ECG signal was band-pass filtered (0.05-100 Hz) and digitized at a sampling frequency of 500 Hz with 12-bit resolution to be displayed in the device's GUI software, BioCapture. Data were stored from the BioCapture software on the hard disk of an Intel Core i7 personal computer for offline analysis. The simulation of highway driving was created in a virtual reality 3D cave environment (Fig. 2B) (in the VR lab, Research Building, Qatar University). The simulation scenario was a two-way highway with two lanes in each direction, low traffic density, a late afternoon and/or night environment, a path with no sharp curves, and a rural environment with scattered trees. ECG data were recorded from three subjects while they drove a car monotonously in the VR environment during active and drowsy states. A front-facing camera was used to detect the drowsiness stages and to segment the ECG data based on drowsiness.
The ECG data of each subject were exported using the BioCapture software and segmented using a CSV splitter for analysis in the Kubios software. The ECG signal was recorded from each subject for approximately one hour, until the subject became drowsy. The one-hour recording was split into six segments of 10 minutes each; this was done to make the analysis of each sample easier and to be able to identify exactly when the subject was awake and/or drowsy.
Results and Discussion
Fig. 3 shows a sample ECG trace from subject one; the selected RR intervals were calculated using the Kubios HRV software and the RR series was produced by interpolation. This RR time series was used to calculate heart rate (HR) and HRV using the same software. The RR time series was also used to calculate the power spectral density (PSD) by applying the Fast Fourier Transform (FFT) method, to identify the LF and HF frequency components of the HRV. Figure 4 shows the PSD averaged over trials for a sample participant in the active and drowsy states. As can be seen from Fig. 4, there is a significant difference in the LF/HF ratio, which decreased drastically from 4.164 (Fig. 4A) when the subject was awake to 1.355 (Fig. 4B) when the subject was drowsy. In addition, HF and LF alone can be taken as indicators of drowsiness: HF increased from 163 ms² when the subject was awake to 980 ms² when the subject was drowsy, and the LF value also increased from 679 to 1328 ms². The summary of LF/HF for the different participants is shown in Table 1, which clearly shows that LF/HF is higher for all subjects during their active states and decreases as the subject becomes drowsy. This result is in line with the findings of other researchers.
Conclusion
The findings from this experiment show that the HRV-based drowsiness detection technique can be implemented on a single-board computer to provide a portable solution that can be deployed in the car. Depending on the sleep stage detected through HRV analysis, the driver can be alerted through either a piezoelectric sensor or an audible alert message, which would help to reduce road accidents significantly.
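To make the LF/HF computation concrete, here is a small sketch of the standard frequency-domain HRV recipe (interpolate the RR series to a uniform rate, estimate the PSD, integrate the LF and HF bands). It uses Welch's method rather than the Kubios FFT settings, and the resampling rate and RR values are illustrative, not the study's data.

```python
# Sketch: compute the LF/HF ratio from RR intervals (generic HRV recipe, not Kubios settings).
import numpy as np
from scipy.signal import welch

def lf_hf_ratio(rr_ms, fs_resample=4.0):
    """rr_ms: successive R-to-R intervals in milliseconds."""
    rr_s = np.asarray(rr_ms) / 1000.0
    beat_times = np.cumsum(rr_s)                            # time of each beat [s]
    t_uniform = np.arange(beat_times[0], beat_times[-1], 1.0 / fs_resample)
    rr_uniform = np.interp(t_uniform, beat_times, rr_s)     # evenly resampled RR series
    f, psd = welch(rr_uniform - rr_uniform.mean(), fs=fs_resample, nperseg=256)
    df = f[1] - f[0]
    lf = psd[(f >= 0.04) & (f < 0.15)].sum() * df           # LF band power (0.04-0.15 Hz)
    hf = psd[(f >= 0.15) & (f < 0.40)].sum() * df           # HF band power (0.15-0.40 Hz)
    return lf / hf if hf > 0 else float("inf")

# Illustrative RR series: a mildly varying heart rate around 70 bpm for ~5 minutes.
rng = np.random.default_rng(0)
rr = 857 + 40 * np.sin(np.arange(350) * 0.3) + rng.normal(0, 10, 350)
print(round(lf_hf_ratio(rr), 2))    # a lower ratio would indicate a drowsier state
```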
-
-
-
Scientific Data Visualization in an Immersive and Collaborative Environment
Tremendous interest in visualizing massive datasets has promoted tiled-display wall systems that offer an immersive and collaborative environment with extremely high resolution. To achieve efficient visualization, the rendering process should be parallelized and distributed among multiple nodes. The Data Observatory at Imperial College London has a unique setup consisting of 64 screens powered by 32 machines, providing a resolution of over 130 megapixels. Various applications have been developed to achieve high-performance visualization by implementing parallel rendering techniques and incorporating distributed rendering frameworks. ParaView is one such application that targets the visualization of scientific datasets while taking computing efficiency into consideration. The main objective of this project is to leverage the potential of the Data Observatory and ParaView for visualization by fostering data exploration, analysis, and collaboration through a scalable and high-performance approach. The primary concept is to configure ParaView on a distributed clustered network and associate the appropriate view with each screen by controlling ParaView's virtual camera. Interaction events with the application are broadcast to all connected nodes in the cluster so that they update their views accordingly. The major challenges of such implementations are synchronizing the rendering across all screens, maintaining data coherency, and managing data partitioning. Moreover, the project aims to evaluate the effectiveness of large display systems compared to typical desktop screens. This has been achieved by conducting two quantitative studies assessing individual and collaborative task performance. The first task was designed to investigate individuals' mental rotation ability by displaying a pair of 3D models, as proposed by Shepard and Metzler, on the screen at different orientations. The participant was then asked whether the two models were the same or mirrored, allowing individual task performance to be evaluated through the ability to recognize orientation changes in 3D objects. The task consisted of two levels: easy and hard. For the easy level, the second model was rotated by a maximum angle of 30° on two axes; the hard level had no limit on the angle of rotation. The second task was developed specifically for ParaView to assess the collaboration aspect. The participants had to use basic navigational operations to find hidden needles injected into a 3D brain model within 90 seconds. In both tasks, the time taken to complete the task and the correctness were measured in two environments: 1) the Data Observatory, and 2) a simple desktop screen. The average number of correct responses in the mental rotation task was calculated for all participants. The number of correct answers in the Data Observatory was significantly higher than on the desktop screen regardless of the amount of rotation. The participants could better distinguish mirrored objects from identical ones in the Data Observatory, with 86.7% and 73.3% correct in the easy and hard levels, respectively. On the typical desktop screen, however, participants correctly answered less than half of the hard-level questions. This indicates that immersive large-display environments provide a better representation and depth perception of 3D objects.
This improves task performance when visualizing 3D scenes in fields that require the ability to detect variations in position or orientation. Overall, the average completion time of both displays in the easy task was roughly the same. In contrast, the participants required a longer time to complete the hard task in the Data Observatory. This could be because the large display space occupies a wide visual field, giving viewers an opportunity to ponder the right answer. In the collaborative search task, the participants found all the hidden needles within the time limit in the Data Observatory. The fastest group completed the task in 36 seconds, while the longest recorded time was around one minute and 12 seconds. On the desktop screen, however, all participants consumed the full 90 seconds. In the small-screen environment, the mean of the correct responses is estimated at 55%; the maximum number of needles found was 3 out of 4, achieved by only one group. To evaluate the overall efficiency of the Data Observatory, a one-way ANOVA was used to test for significant effects on the correctness of both tasks. Completion time was discarded from this analysis because of the differences in the tasks' nature. The ANOVA revealed a significant effect of display type on the number of correct responses, F(1,48) = 10.517, p < 0.002. This indicates that participants performed better in the Data Observatory than on the simple desktop screen. These results therefore support the hypothesis that large displays improve task performance and collaborative activities in terms of accuracy. The integration of both system solutions provides a novel approach to visualizing the enormous amount of data generated from complex scientific computing, adding great value for researchers and scientists to analyze, discuss, and discover the underlying behavior of certain phenomena.
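For readers unfamiliar with the statistical test cited above, the short Python sketch below runs a one-way ANOVA on per-participant correctness scores. The scores are hypothetical placeholders, not the study's data; the point is only to show the form of the F(1,48) test reported in the abstract.

```python
# Illustrative only (hypothetical scores, not the study's data): one-way ANOVA
# comparing correctness between the two display conditions.
from scipy import stats

data_observatory = [0.90, 0.80, 1.00, 0.85, 0.90]   # hypothetical per-participant correctness
desktop_screen   = [0.60, 0.50, 0.70, 0.55, 0.65]

f_stat, p_value = stats.f_oneway(data_observatory, desktop_screen)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
```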
-
-
-
Virtual Reality Game for Falconry
Authors: Noora Fetais, Sarah Lotfi Kharbach, Nour Moslem Haj Ahmad and Salma Salah Ahmad
Traditions and culture play a major role in our society, as they are a source of a person's pride and honor. One of the Qatar National Vision 2030 pillars related to social development aims at preserving Qatar's national heritage. From this perspective, an innovative idea evolved to use Virtual Reality (VR) technology to preserve traditions. The game simulates the genuine Qatari hunting sport, which is considered one of the most famous traditional sports in Qatar. However, practicing this sport is very expensive in terms of time, effort and resources, and since it is physically challenging, only male adults can join. This project will not only preserve the traditional sport from extinction, but will also allow children of both genders to participate in it. The game will be an innovative means to help spread Qatari heritage by commercializing it to the world. Moreover, it will help players learn the rules of the sport in a safe and entertaining environment. The game is one of a kind, since it merges technology and heritage at the same time. It is a virtual reality game that teaches younger generations about their ancestors' pastimes. It is a simulation of the traditional falconry sport that will teach children, step by step and in an attractive manner, the basics of the sport, such as holding the falcon, making the falcon fly, and much more. In addition, we are cooperating with a hardware team from computer engineering that is customizing a glove to ensure total immersion of the player in the game by making the player feel a pull whenever the falcon is on their hand and releasing the pull when it is not. Another main idea behind this project is to develop a strong relationship between the Qatari people and their heritage, which would then be more accessible throughout the year instead of only on special occasions. It will also help expats in Qatar explore this extraordinary heritage game at national events such as National Day and Sport Day. The project stands out with its original idea and captivating implemented features such as the desert environment, realistic audio, visual effects, and gameplay. The game is not limited to visual effects; although they are a key element, behind them lie countless algorithm implementations and deployment processes. It was crucial to conduct an ethnographic study to accurately simulate the sport, by visiting the Qatari Al Gannas Society, meeting with a specialist mentor to learn more about the hunting sport in Qatar, and collecting information about the different falcon species in the country. This game can serve as a great ambassador of the Qatari falconry hunting sport at local and international events. Falconry is not limited to Qatar; since 2012, this sport has been recognized as an intangible cultural heritage of humanity by UNESCO. We tried to customize the game exclusively for Qatar by adding features that only Qatari hunters practice, such as holding the falcon on the left hand only.
-
-
-
Robotic Probe Positioning System for Structural Health Monitoring
Authors: Ali Ijaz, Muhammad Ali Akbar and Uvais Qidwai
Structural Health Monitoring (SHM) is a critical component for sustainable civil and mechanical structures in modern urban settings. The skyscrapers and huge bridges of a modern metropolis are essential aspects of the prosperity and development of a country, but at the same time they present a great challenge in terms of maintaining and sustaining the structures in good health. Due to the complex designs of these structures, it is typically very dangerous to perform SHM tasks with human personnel. Deploying a monitoring team with various forms of equipment and scaffolding, accompanied by their hoisting machines, becomes extremely expensive for the maintenance and planning of the structures, causing unnecessary cost spill-over into other areas of the available budget. For most metallic structures, a fast method of scanning an area more closely is Magnetic Flux Leakage (MFL) based defect detection, which is considered the most economical approach for inspecting metallic structures. Traditionally, a hand-held device is used to perform the MFL inspection. In this paper, an autonomous MFL inspection robot is presented which is small, flexible and remotely accessible. The robot is built on an aluminum chassis, driven by two servomotors, and holds a stack of very powerful neodymium magnets to produce the required magnetic circuit. As the robot moves on a metallic surface, the magnetic circuit produces a layered magnetic field just under the scanning probe. The probe is composed of several Hall-effect sensors that detect any leakage in the magnetic circuit, which occurs due to an abnormality in the surface, thus detecting an anomaly. In this paper, a coordinated robotic inspection system is proposed that utilizes a set of drones together with a positioning robotic crawler platform with additional load-hoisting capabilities, used to position a specific defect-locating probe on the building under scan. The proposed methodology can play a vital role in SHM since it is capable of scanning a specific area and transmitting the results back in a shorter time with a very safe mode of operation. This method is more reliable than fixed sensors, which focus only on a particular area of the structure. The design of the SHM robot involves the intelligent integration of a navigation system comprising crucial parts that act as its backbone and allow the robot to work autonomously. These parts include a GPS module, compass, range sensor, and infrared (IR) sensor, along with the MFL probe, a winch setup, and a powerful PMDC servo motor controller (MC160) used to drive two powerful motors. The MC160 brushed motor controller proves to be an excellent platform for controlling brushed DC motors; it consists of two power drivers in addition to an OSMC connector for a third power driver (winch motor control). These components add extra degrees of freedom to the robotic system for SHM. The novelty of the methodology is that the robot's program logic is not fixed: it is flexible in terms of path following, and it has the ability to detect an obstacle while on its way to scan the building. It not only detects the obstacle but also changes its course and automatically adopts a new route to the target destination. Such an autonomous robotic system can play a vital role in Structural Health Monitoring in contrast to manual inspection, eliminating the need for the physical presence of humans in severe weather conditions.
The presented methodology is condition-based, in contrast to a schedule-based approach. A coarse scan is easily performed, and the robot is reconfigurable in the sense that it automatically changes its course to adapt to rough terrain and avoids obstacles on its way. Easy deployment makes the robot an excellent choice for SHM with minimum cost and enhanced flexibility. The proposed robotic system can perform a coarse-level scan of a tall building using drones and the probe deployment robots (PDR). The drones provide a rough estimate of the location of a possible defect or abnormality, and the PDR inspects the anomaly more closely. In addition, the coarse information about a possible defect can also help in deploying other means of inspection at a much lower cost, since the whole structure need not be inspected.
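To make the probe's operating principle concrete, here is a minimal Python sketch of how leakage readings from an array of Hall-effect sensors might be flagged as anomalies. The threshold rule, units, and sensor-reading format are assumptions for illustration; the actual robot runs its own embedded detection logic.

```python
# Illustrative sketch (assumed data format, not the robot's embedded code):
# flag MFL anomalies when a Hall-effect channel deviates strongly from its
# baseline level along the scan.
import numpy as np

def detect_mfl_anomalies(scan, n_sigma=6.0):
    """scan: 2D array, shape (n_samples, n_hall_sensors), e.g. in gauss.
    Returns indices of scan positions where any channel deviates by more
    than n_sigma standard deviations from that channel's median level."""
    baseline = np.median(scan, axis=0)            # per-channel baseline field
    spread = np.std(scan, axis=0) + 1e-9          # per-channel variability
    deviation = np.abs(scan - baseline) / spread  # normalized leakage signal
    return np.where((deviation > n_sigma).any(axis=1))[0]

# Example with synthetic data: a flat field plus one injected leakage spike.
rng = np.random.default_rng(0)
scan = rng.normal(200.0, 1.0, size=(500, 8))
scan[312, 3] += 25.0                              # simulated surface defect
print(detect_mfl_anomalies(scan))                 # -> [312]
```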
-
-
-
Coordinated Robotic System for Civil Structural Health Monitoring
Authors: Muhammad Ali Akbar and Uvais Ahmed Qidwai
With the recent advances in sensors, robotics, unmanned aerial vehicles, communication, and information technologies, it is now feasible to move towards the vision of ubiquitous cities, where virtually everything throughout the city is linked to an information system through technologies such as wireless networking and radio-frequency identification (RFID) tags, to provide systematic and more efficient management of urban systems, including civil and mechanical infrastructure monitoring, and to achieve the goal of resilient and sustainable societies. In the proposed system, an unmanned aerial vehicle (UAV) is used to ascertain a coarse defect signature using panoramic imaging. This involves image stitching and registration so that a complete view of the surface is obtained with respect to a common reference or origin point. Thereafter, crack verification and localization is performed using the magnetic flux leakage (MFL) approach, carried out with the help of a coordinated robotic system in which the first modular robot (FMR) is placed at the top of the structure whereas the second modular robot (SMR) is equipped with the designed MFL sensory system. In the initial findings, the proposed system identifies and localizes cracks in the given structure. Research Methodology: The proposed approach combines the advantages of visual and MFL inspection to improve the efficiency of SHM. The two approaches should therefore be used in a way that completes the whole inspection in an optimal time. Because visual inspection is fast to process, it is performed first, followed by MFL-based verification. The visual inspection is carried out such that the drone takes off from a fixed point and takes images at different heights without changing the GPS coordinates of the start point during flight. After completing the first scan, the GPS coordinates are shifted and the same procedure of taking images at different heights is repeated. The process continues until the drone returns to the starting GPS coordinates. The images taken at different heights for particular coordinates are treated as a single set. Thereafter, image stitching (IS) is applied to each set individually. IS involves a series of steps applied to consecutive images of a particular set, such that one image is taken as the reference image (RI) and the other is termed the current image (CI). The resulting stitched image becomes the RI for the next consecutive image, and the whole stitching process is applied again. The process continues for each set until a final stitched image is obtained. The stitched result is saved in the database with its corresponding GPS values. The same procedure of capturing and stitching images of the structure is repeated after a few months, depending on the structural sensitivity as well as the severity of the surrounding weather conditions. The current results are compared with the stitched images present in the database, and if an anomaly is detected, the HP coordinates (i.e. the GPS coordinates) along with the estimated height of that particular location are sent to the FMR to proceed with crack verification using MFL. The GPS module present in the FMR guides the robot about its own location.
As soon as the Arduino Mega2560 microcontroller receives the GPS coordinates from the system, it translates them and compares them with its current location. The translation is needed because the FMR is located on top of the building whereas the drone flies at a particular distance from the building; to obtain a correct translation, the drone should remain at that fixed distance from the structure during the whole scanning process. The robot chooses its direction based on the comparison between its current GPS coordinates and the translated received GPS coordinates. As the robot moves, it keeps checking the current GPS values and takes decisions accordingly. Since there might be temporary or permanent obstacles present on the roof, for example for decoration purposes, an ultrasonic range sensor is used: when the robot comes within a defined distance of an obstacle, the sensor guides the robot to change its path, and as soon as the obstacle disappears from the sensor range the robot resumes checking the GPS values to reach its target destination. When it reaches the target destination, it instructs the winch motor to lower the SMR to the location and obtain the current MFL reading of that place. These readings are sent to the system; if an anomaly is detected, it is verified that the structure has a deformation at that particular location. If multiple anomalies have been detected in the vision-based approach, the robot performs the same procedure to examine each of them. Conclusion: Based on the initial findings, the proposed system appears to be a robust and inexpensive alternative to current approaches for the automated inspection of civil/mechanical systems. The combination of the VI and MFL approaches provides the opportunity to detect, verify and localize deformation in the structure.
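The panoramic imaging step described above can be approximated with OpenCV's high-level stitcher, as in the sketch below. This is a generic stand-in for the per-set reference/current-image stitching pipeline in the abstract, not the authors' implementation; the image file names are placeholders.

```python
# Minimal sketch (not the authors' pipeline): stitch one set of drone images
# taken at different heights for a single GPS coordinate into a panorama.
import cv2

def stitch_image_set(image_paths):
    images = [cv2.imread(p) for p in image_paths]
    if any(img is None for img in images):
        raise FileNotFoundError("one or more images could not be read")
    stitcher = cv2.Stitcher_create(cv2.Stitcher_SCANS)  # planar surface mode
    status, panorama = stitcher.stitch(images)
    if status != cv2.Stitcher_OK:
        raise RuntimeError(f"stitching failed with status {status}")
    return panorama

# Hypothetical usage: one image set for a single GPS point, saved to disk for
# later comparison against a previous scan of the same facade.
pano = stitch_image_set(["h10m.jpg", "h20m.jpg", "h30m.jpg"])
cv2.imwrite("panorama_lat_lon.jpg", pano)
```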
-
-
-
Visualization of Wearable Data and Biometrics for Analysis and Recommendations in Childhood Obesity
Authors: Michael Aupetit, Luis Fernandez-Luque, Meghna Singh, Mohamed Ahmedna and Jaideep Srivastava
Obesity is one of the major health risk factors behind the rise of non-communicable conditions. Understanding the factors influencing obesity is very complex, since many variables can affect the health behaviors leading to it. Nowadays, multiple data sources can be used to study health behaviors, such as wearable sensors for physical activity and sleep, social media, mobile and health data. In this paper we describe the design of a dashboard for the visualization of actigraphy and biometric data from a childhood obesity camp in Qatar. This dashboard allows quantitative discoveries that can be used to guide patient behavior and orient qualitative research. CONTEXT Childhood obesity is a growing epidemic, and with technological advancements, new tools can be used to monitor and analyze the lifestyle factors leading to obesity, which in turn can help in timely health behavior modifications. In this paper we present a tool for the visualization of personal health data, which can assist healthcare professionals in designing personalized interventions for improving health. The data used for the tool were collected as part of a research project called "Adaptive Cognitive Behavioral Approach to Addressing Overweight and Obesity among Qatari Youth" (ICAN). The ICAN project was funded by the Qatar National Research Fund (a member of Qatar Foundation) under project number NPRP X-036-3-013. The participants in the study were involved in activities aimed at improving their health behavior and losing weight. All participants and their parents/guardians provided informed consent prior to participation. Data from various sources (social media, mobile, wearables and health records) were collected from subjects and linked using a unique subject identifier. These datasets provided what we have defined as a 360-degree Quantified Self (360QS) view of individuals. We have focused on the visualization of the biometrics and physical activity data. We propose different visualization techniques to analyze the activity patterns of participants in the obesity trial. Our dashboard is designed to compare data across time, and among reference individuals and groups. DATA FROM OBESE CHILDREN Biometric data were measured periodically and included height, weight and the derived body-mass index (BMI), body fat percentage, waist circumference and blood pressure for each individual. Physical activity data were collected via accelerometers. The raw signals have been quantized into four activity levels: sedentary, light, moderate and vigorous, using a human activity recognition algorithm. INTERACTIVE ANALYTIC DASHBOARD The objective of the dashboard is to provide an overview of the actigraphy data and enable primary data exploration by an expert user. In a Control Panel, drop-down menus enable selecting two subjects or groups of subjects to be compared based on identifiers and gender. During data collection, some devices were not worn at all times; hence they recorded long periods of "sedentary" activity. The user can use check-boxes to select the biometrics she wants to compare (e.g., BMI and Body Fat Percentage). A Visualization Panel shows both selected (groups of) subjects as bar charts indicating the hours of activity, with an activity-level breakdown per day through time.
The color legend of the bars is shown in the control panel: reddish colors for moderate (light red) and vigorous (dark red) activity levels, and bluish colors for light (light blue) and sedentary (dark blue) activity levels. The user can select a time window by brushing horizontally on the time range to zoom in or out. Two line charts show the biometrics selected in the control panel, with line colors corresponding to the selected (groups of) subjects. The average activity breakdown by activity level is also displayed for weekdays and for weekend days. QUANTITATIVE ANALYSIS AND SUPPORT Thanks to our dashboard we can easily identify trends in biometrics and compare activity levels during weekdays and weekend days to support lifestyle recommendations. CONCLUSION This interface is a tool that gives a primary overview of the data and is likely to orient more detailed analysis. For instance, a more in-depth study of the relation between sleep duration and BMI could be conducted. Another outcome related to the experimental setup would consist of recommending biometrics to be measured more often, or finding incentives for subjects to wear the devices more consistently. A health expert could also provide the subject with a target status (e.g., weight) to compare against and converge to, along with recommendations about the activities he/she should improve: e.g., go to bed earlier, wake up earlier during the weekend, have more vigorous activity during the afternoon, etc. Other available tools, such as the Fitbit dashboard (Fitbit Inc., USA), do not give detailed activity levels across time nor comparison with a reference individual. Our next steps include performing a qualitative evaluation of our dashboard and making improvements based on the end users' feedback.
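As an illustration of the weekday/weekend breakdown computed for the dashboard, the following Python sketch aggregates quantized activity levels per subject with pandas. The column names, the minute-level format, and the Friday/Saturday weekend definition are assumptions, not the project's actual schema.

```python
# Sketch only (assumed schema): summarize hours per activity level and compare
# weekdays with weekend days, as the dashboard panel does.
import pandas as pd

# Hypothetical minute-level actigraphy: one row per subject per minute.
df = pd.DataFrame({
    "subject": ["S01"] * 6,
    "timestamp": pd.to_datetime([
        "2016-02-05 10:00", "2016-02-05 10:01",   # Friday
        "2016-02-06 10:00", "2016-02-06 10:01",   # Saturday
        "2016-02-07 10:00", "2016-02-07 10:01",   # Sunday
    ]),
    "level": ["sedentary", "light", "moderate", "vigorous", "sedentary", "light"],
})

# Assume a Friday/Saturday weekend (dayofweek: Monday=0 ... Sunday=6).
df["is_weekend"] = df["timestamp"].dt.dayofweek.isin([4, 5])
hours = (
    df.groupby(["subject", "is_weekend", "level"])
      .size()                 # minutes spent in each activity level
      .div(60.0)              # convert minutes to hours
      .rename("hours")
      .reset_index()
)
print(hours)
```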
-
-
-
Variable Message Sign strategies for Congestion Warning on Motorways
Authors: Wael Khaleel Mohammad Alhajyaseen, Nora Reinolsmann, Kris Brijs and Tom Brijs
1. Introduction Motorways are the safest roads by design and regulation. Still, motorways in the European Union accounted for nearly 27,500 fatalities from 2004 to 2013 (Adminaite, Allsop, & Jost, 2015). The likelihood of rear-end collisions increases with higher traffic densities. This is alarming considering that the proportion of traffic on motorways has increased over the past decade (Adminaite et al., 2015). The onset of traffic congestion is characterized by changing flow conditions, which can pose a serious safety hazard to drivers (Marchesini & Weijermars, 2010). In particular, hard congestion tails force drivers to change from motorway speed to stopped conditions, which can result in severe rear-end crashes (Totzke, Naujoks, Mühlbacher, & Krüger, 2012). Fatalities and injuries due to motorway crashes represent a threat to public health and should be reduced as much as possible. 2. Congestion warning and VMS The effect of congestion on safety generally depends on the extent to which drivers are surprised by the congestion. The type of congestion, the location of the queue, and the use of variable message signs to warn drivers in advance can influence whether drivers are able to decelerate safely or not (Marchesini & Weijermars, 2010). Variable message signs (VMS) are considered one of the primary components of Intelligent Transportation Systems (ITS) and provide motorists with route-specific information or warnings. The advantage of VMS is that they can display traffic state messages dynamically and in real time. Accordingly, VMS can reduce uncertainty and prepare drivers to anticipate and safely adapt to a traffic event (Arbaiza & Lucas-Alba, 2012). The Easyway II project is one of the important guidelines for VMS harmonization in Europe, developed to update and improve current VMS signing practices. Despite this effort towards harmonization, a broad variety of sign designs, message types and field placements is still applied to warn drivers about congestion tails. Also, empirical research testing the available guidelines provides inconsistent findings. Hence, further scientific research is needed to shed more light on the effectiveness of different VMS types, message designs, and placements in influencing safe driving performance. 3. Objectives Available guidelines suggest that advance warning messages should be placed 1 km, 2 km, and 4 km prior to a traffic event if the purpose is to allow drivers to anticipate safely (i.e., tactical use of VMS), and no further than 10 km prior to a traffic event when the purpose is to influence route choice rather than driver behavior (i.e., strategic use of VMS) (Evans, 2011; Federal Highway Administration, 2000). Gantry overhead signals and cantilever side poles are the most common VMS types. The Easyway guidelines contain different formats for congestion warning messages, namely messages containing a) pictograms of congestion with or without a redundant text unit, b) a maximum of 4 information units, and c) with or without distance information (Arbaiza & Lucas-Alba, 2012). The objective of this study was to analyze the effect of different congestion warning VMS formats on visual and driving behavior on motorways leading to a hard congestion tail. To that purpose, we used a driving simulator to observe accidents, speed and deceleration, and an eye tracker to monitor gaze fixations. 4.
Method Data from thirty-six drivers (male and female) with an average age of 43 years were collected. We implemented a within-subject design with all participants exposed to seven VMS scenarios in randomized order. The apparatus was the driving simulator of the Transportation Research Institute (IMOB, UHasselt), a medium-fidelity, fixed-base simulator (STISIM M400; Systems Technology Incorporated) logging a wide range of driving parameters. The mock-up consists of a Ford Mondeo with a steering wheel, direction indicators, brake pedal, accelerator, clutch, and manual transmission. The virtual environment is visualized through three projectors on a 180° screen including three rear-view mirrors. Furthermore, we used the FaceLAB 5.0 eye tracking system to record eye movements. The eye tracker was installed on the dashboard of the driving cab and accommodated head rotations of +/-45° and gaze rotations of +/-22° around the horizontal axis. 5. Results We found that drivers with higher initial speeds stopped closer to the congestion tail and were more likely to have a rear-end crash. A gantry-mounted congestion warning with a pictogram and the word "congestion" presented at a distance of 1 km resulted in the lowest mean speeds and smoothest deceleration for all drivers. A congestion warning at a distance of more than 3 km had no effect on driver behavior in the critical zone before the congestion tail. Eye fixations on gantry-mounted VMS were more frequent but shorter in duration compared to cantilevers. Finally, the visual load imposed on drivers increased with more information units on the VMS. 6. Conclusion The distance between the congestion warning and the actual congestion tail is a crucial aspect of the effectiveness of this kind of VMS. VMS congestion warnings located too far away lose their effect in the critical approaching zone, and VMS congestion warnings located too close might compromise safe deceleration. A gantry-mounted congestion warning displaying the word 'congestion' together with a pictogram located 1 km before the congestion tail was clearly noticed from all lanes without imposing too much visual load, and had the best impact on speed, resulting in smooth deceleration and safe stopping distances. In contrast, a congestion warning located more than 3 km from the actual congestion tail had no safety effect, as drivers started to speed up again before reaching the critical approaching zone. 7. Acknowledgment This publication was made possible by the NPRP award [NPRP 9-360-2-150] from the Qatar National Research Fund (a member of The Qatar Foundation). The statements made herein are solely the responsibility of the author[s]. 8. References Adminaite, D., Allsop, R., & Jost, G. (2015). ETSC - Ranking EU Progress on Improving Motorway Safety, PIN Flash Report 28 (March). https://doi.org/10.1016/j.trf.2014.06.016 Arbaiza, A., & Lucas-Alba, A. (2012). Variable Message Signs Harmonisation: Principles of VMS Messages Design, Supporting Guideline (December), 1–60. Evans, D. (2011). Highways Agency policy for the use of Variable Signs and Signals (VSS) (December). Federal Highway Administration. (2000). Chapter 2E. Guide Signs - Freeways and Expressways, 1–82. Marchesini, P., & Weijermars, W. (2010). The relationship between road safety and congestion on motorways. SWOV Institute for Road Safety Research, 28. https://doi.org/R-2010-12 Totzke, I., Naujoks, F., Mühlbacher, D., & Krüger, H. P. (2012).
Precision of congestion warnings: Do drivers really need warnings with precise information about the position of the congestion tail? Human Factors of Systems and Technology, 235–247.
-
-
-
Qatar Meteorology Department Security Enhancement Recommendations
The Internet has become part of almost every organizational culture, and security threats increase with increased use of the Internet. Security practice has therefore become increasingly important for almost every organization, and the Qatar Meteorology Department (QMD) in the State of Qatar is no exception. The aim of this research is to evaluate the current security level of the QMD by examining the security practices present in the organization and the security awareness level among its employees, and then to provide the organization with security policy and awareness program recommendations to enhance its security practice. The importance of this research lies in its contribution to enhancing the organization's security level. In order to achieve the research objectives, a mixture of methodologies was used to collect the fundamental data, including survey questionnaires, interviews, and field observation. For the data collection process to succeed, a number of strategies were used in each method to ensure the greatest benefit from each method; together, these methods sufficed to collect the essential primary data. Furthermore, a number of publications were reviewed in order to understand the research subject further. Based on the collected data, several analysis methods were used to draw a conclusion about the organizational security level, and the findings illustrate the need for security policies and awareness programs in order to enhance the organization's security level. Thus, a number of security policy and awareness program recommendations have been established. The research findings and the provided recommendations can support the organization in enhancing its security level as much as possible, since no system is completely secure. Furthermore, this research presents valuable information about the organization's current security level and provides recommendations to enhance it.
-
-
-
A Reverse Multiple-Choice Based mLearning System
Authors: AbdelGhani Karkar, Indu Anand and Lamia Djoudi
Mobile learning can help in accelerating students' learning strengths and comprehension skills. Due to the immediacy and effectiveness of mobile learning, many mobile educational systems with diverse assessment techniques have been proposed. However, we observe a common limitation in existing assessment techniques: the learner cannot correlate question and answer choices or freely adapt answers in a given multiple-choice question, often resulting in an inaccurate assessment grade. In the current work, we present a reverse multiple-choice mobile learning system that is based on knowledge acquisition. Using a knowledge base, a set of answer choices is created for a multiple-choice question. For each of one or more of the incorrect answers, a follow-up query is generated for which the incorrect answer is correct. The goal is to find, via a query, an optimal association between the incorrect answers and the correct answer. User studies of the proposed system demonstrated its efficiency and effectiveness. Keywords—Mobile Learning, Knowledge Acquisition, Multiple Choice, Expert Systems. I. Introduction Nowadays, mobile devices have opened a new horizon for learning. As most people own private handheld smartphones, these have become a main medium of connectivity and re-examination. Using smart devices for learning is beneficial and attractive, as the learner can access educational materials and assessment exercises at any time. However, existing assessment techniques such as the multiple-choice technique [1] do not enable a learner to modify answers in the given multiple-choice question, resulting in an inaccurate assessment grade. For this reason, the presented research work extends the former multiple-answer question technique with the ability to select wrong answers in the mobile learning scope. Thus, extra assessments are carried out to assess the knowledge of the learner using the selected wrong answer. II. Review of the Literature Several mobile learning applications have been proposed due to their ability to provide more engaging and successful learning environments [2]. Chen et al. [3] proposed a mobile learning system that provides multistage guiding mechanisms when the student selects a wrong answer in a multiple-choice question. The proposed system enhanced the learning achievements of students and their learning motivation. Huang et al. [4] developed a mobile learning tool to improve English learning for English as a foreign language (EFL) students. The tool uses a five-step vocabulary learning (FSVL) strategy and employs conventional multiple-choice questions to assess the learning of students. Koorsse et al. [5] proposed a mobile-based system that uses two multiple-choice assessment methods. The assessment methods use self-regulated principles to support the learning of secondary school students in science and mathematics. Although many mobile-based educational systems have been proposed, adapting multiple-choice questions according to a selected wrong answer has not been considered in previous systems. Hence, our system can be used to enhance the learning assessment of learners. III. The Proposed System Our proposed system provides educational content and uses a novel assessment technique based on reverse multiple choice [6]. The system can be used in the classroom to assess the learning of students.
The proposed system covers: 1) presentation of educational content, 2) generation of multiple-choice based questions including their follow-up queries, and 3) performance analysis of the student. For the presentation of the content, we have created an educational repository that contains a collection of educational stories. These stories are collected from diverse online ebook libraries such as the MagicBlox library [7], BookRix [8], and others. For the multiple-choice questions, we start with the familiar multiple-choice format [1] and extend it into what we call the "Reverse Multiple-Choice Method" (RMCM). The question uses the power of wrong answer choices not just as "distractors," but to extract information about students' depth of learning from brief, machine-gradable answers. An RMCM question asks a student to weigh why a particular answer choice is incorrect, identify the segment(s) of the query on which the answer turns, and then change those segment(s) to make it correct. Indeed, the examiner must carefully select the answer choices for a multiple-choice query, but RMCM question databanks have lasting value and high re-usability; even having seen a question earlier, an examinee must answer it thoughtfully. The RMCM approach especially suits m-learning environments, since thinking comprises most of the effort and the actual answers are brief. Finally, for the performance analysis of students, we use the total number of correct answers given by the student to assess his/her performance. When a reverse multiple-choice option is employed, the grade is computed according to the number of correct attempts achieved by the student; for every wrong attempt the performance is decreased by a certain percentage. Bibliography [1] K. M. Scouller and M. Prosser, "Students' experiences in studying for multiple choice question examinations," Studies in Higher Education, vol. 19, no. 3, pp. 267-279, Jan. 1994. [2] K. Wilkinson and P. Barter, "Do mobile learning devices enhance learning in higher education anatomy classrooms?," Journal of Pedagogic Development, vol. 6, no. 1, 2016. [3] C. H. Chen, G. Z. Liu, and G. J. Hwang, "Interaction between gaming and multistage guiding strategies on students' field trip mobile learning performance and motivation," British Journal of Educational Technology, vol. 47, no. 6, pp. 1032-1050, 2016. [4] C. S. Huang, S. J. Yang, T. H. Chiang, and A. Y. Su, "Effects of situated mobile learning approach on learning motivation and performance of EFL students," Journal of Educational Technology & Society, vol. 19, no. 1, 2016. [5] M. Koorsse, W. Olivier, and J. Greyling, "Self-Regulated Mobile Learning and Assessment: An Evaluation of Assessment Interfaces," Journal of Information Technology Education: Innovations in Practice, vol. 13, pp. 89-109, 2014. [6] I. M. Anand, "Reverse Multiple-Choice Based Clustering for Machine Learning and Knowledge Acquisition," International Conference on Computational Science and Computational Intelligence (CSCI), vol. 1, p. 431, 2014. [7] "MagicBlox Children's Book Library." Available: http://magicblox.com/. Accessed: 20-Oct-2017. [8] "BookRix." Available: https://www.bookrix.com/. Accessed: 20-Oct-2017.
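The grading rule sketched in the abstract (the score drops by a fixed percentage for each wrong attempt) could be implemented as below. The 20% penalty and the function name are illustrative assumptions, not values taken from the paper.

```python
# Illustrative sketch (assumed penalty value): score an RMCM question where
# each wrong attempt reduces the remaining credit by a fixed percentage.
def rmcm_score(attempts, penalty=0.20):
    """attempts: list of booleans, True if that attempt was correct.
    Returns a score in [0, 1]: full credit minus `penalty` per wrong attempt,
    and zero credit if the student never reaches the correct answer."""
    if not any(attempts):
        return 0.0
    wrong_before_correct = attempts.index(True)
    return max(0.0, 1.0 - penalty * wrong_before_correct)

print(rmcm_score([True]))               # 1.0  (correct on the first attempt)
print(rmcm_score([False, False, True])) # 0.6  (two wrong attempts first)
print(rmcm_score([False, False]))       # 0.0  (never correct)
```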
-
-
-
Tackling item cold-start in recommender systems using word embeddings
By Manoj Reddy
We live in the digital age, where most of our activities and services are carried out over the internet. Items such as music, movies and products are consumed over the web by millions of users. The number of such items is large enough that it is impossible for a user to experience everything. This is where recommender systems come into play. Recommender systems play the crucial role of filtering and ranking items for each user based on their individual preferences; they essentially assist the user in making decisions and overcoming the problem of information overload. These systems are responsible for understanding a user's interests and inferring their needs over time. Recommender systems are widely employed across the web and, in many cases, are the core aspect of a business. For example, on Quora, a question-answering website, the entire interface relies on the recommender system to decide what content to display to the user; the content ranges from homepage question ranking to topic recommendation and answer ranking. The goal of a recommender system is to assist users in selecting items based on their personal interests. By doing so, it also increases the number of transactions, creating a win-win situation for both the end users and the web service. Recommender systems are a relatively new and exciting field with huge potential for the future. The field originated from information retrieval and search engines, where the task was: given a query, retrieve the most relevant documents. In the recommender system domain, the user should be able to discover items that he/she would not have been able to search for directly. One main challenge in recommender systems is cold start, defined as the situation when a new user or item joins the system. We are interested in item cold start; in this case the recommender system needs to learn about the new item and decide which users it should be recommended to. In this work, we propose a new approach to tackle the cold-start problem in recommender systems using word embeddings. Word embeddings are semantic representations of words in a mathematical form, such as vectors. Embeddings are very useful since they are able to capture the semantic relationships between words in the vocabulary. There are various methods to generate such a mapping, including neural networks, dimensionality reduction on the word co-occurrence matrix, and probabilistic models. The underlying concept behind these approaches is that words that share common contexts in the corpus lie in close proximity in the semantic space. Word2vec is a popular technique by Mikolov et al. that has gained tremendous popularity in the natural language processing domain. They proposed two versions, namely the continuous skip-gram and the continuous bag-of-words (CBOW) models. They were able to overcome the problem of sparsity in text and demonstrate the technique's effectiveness on a wide range of NLP tasks. Our dataset is based on a popular website called Delicious, which allows users to store, share and discover bookmarks on the web. For each bookmark, users can generate tags that provide meta-information about the page, such as the topics discussed and important entities. For example, a website about research might contain tags like science, biology, and experiment. The problem now becomes: given a new bookmark with tags, compute which users to recommend this new bookmark to.
In the item cold-start situation, a popular technique is to use content-based approaches and find items similar to the new item. The new item can then be recommended to the users of the computed similar items. In this paper, we propose a method to compute similar items using word embeddings of the tags present for each bookmark. Our methodology involves representing each bookmark as a vector by combining the word embeddings of its tags. There are various possible aggregation mechanisms; we chose the average in our experiments since it is intuitive and easy to compute. The similarity between two bookmarks can be computed as the cosine similarity between their corresponding embedding vectors. The total number of bookmarks in the dataset is around 70,000, with around 54,000 tags. The embeddings are obtained from the GloVe project, where training is performed on Wikipedia data based on aggregated global word-word co-occurrence statistics. The vocabulary of these embeddings is fairly large, containing about 400k words, and each word is stored as a 300-dimensional vector. The results were evaluated manually and look promising: we found that the recommended bookmarks were highly relevant in terms of the topics discussed. Some example topics discussed in the bookmarks were social media analysis, movie reviews, vacation planning, and web development. The reason that embeddings perform well is that they are able to capture the semantic information of bookmarks through tags, which is useful in cold-start situations. Our future work will involve other aggregation schemes, such as weighting the tags differently based on their importance. A more suitable method of evaluation would be to measure feedback (ratings/engagement) from users in a live recommender system and compare against other approaches. In this work, we demonstrate the feasibility of using word embeddings to tackle the item cold-start problem in recommender systems. This is an important problem whose solution can deliver a positive impact on the performance of recommender systems.
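The core of the approach described above (average the tag embeddings of each bookmark, then rank candidates by cosine similarity) can be sketched as follows. The GloVe loading assumes the standard whitespace-separated text format; the file name and the toy bookmarks are placeholders.

```python
# Minimal sketch of the tag-embedding approach described above (not the
# authors' code): represent each bookmark by the average GloVe vector of its
# tags and rank existing bookmarks by cosine similarity to a new one.
import numpy as np

def load_glove(path):
    """Parse a GloVe text file: one 'word v1 v2 ... vd' line per word."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

def bookmark_vector(tags, glove):
    vecs = [glove[t] for t in tags if t in glove]
    return np.mean(vecs, axis=0) if vecs else None

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

glove = load_glove("glove.6B.300d.txt")          # placeholder file name
catalog = {                                      # hypothetical tagged bookmarks
    "bm1": ["science", "biology", "experiment"],
    "bm2": ["movie", "review", "cinema"],
}
new_vec = bookmark_vector(["chemistry", "research"], glove)
ranked = sorted(
    ((bid, cosine(new_vec, bookmark_vector(tags, glove)))
     for bid, tags in catalog.items()),
    key=lambda x: x[1], reverse=True,
)
print(ranked)   # most similar existing bookmarks first
```

The new bookmark would then be recommended to the users of the top-ranked existing bookmarks, as described in the abstract.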
-
-
-
Analyze Unstructured Data Patterns for Conceptual Representation
Authors: Aboubakr Aqle, Dena Al-Thani and Ali Jaoua
Online news media provide aggregated news and stories from different sources all over the world, with up-to-date news coverage. The main goal of this study is to find a solution that acts as a homogeneous source for the news and to represent the news in a new conceptual framework, so that the user can easily and quickly find different updated news items through the designed interface. The mobile app implementation is based on modeling a multi-level conceptual analysis frame. The main concepts of any domain are captured from the hidden unstructured data analyzed by the proposed solution: concepts are discovered by analyzing data patterns and are structured into a tree-based interface for easy navigation by the end user. Our final experimental results show that analyzing the news before displaying it to the end user, and restructuring the final output in a conceptual multi-level structure, produces a new display frame that helps the end user find the related information of interest.
-
-
-
A Machine Learning Approach for Detecting Mental Stress Based on Biomedical Signal Processing
Authors: Sami Elzeiny and Dr. Marwa Qaraqe
Mental stress occurs when a person perceives abnormal demands or pressures that influence their sense of well-being; these high demands sometimes exceed the human capacity to cope. Stressors such as workload, inflexible working hours, financial problems, or handling more than one task can cause work-related stress, which in turn leads to less productive employees. Lost productivity costs the global economy approximately US$1 trillion per year [1]. A survey conducted among 7000 workers in the U.S. found that 42% had left their job to escape a stressful work environment [2]. Some people can handle stress better than others, so stress symptoms can vary. Stress symptoms can affect the human body and wear a person down both physically and mentally. Hopelessness, anxiety, and depression are examples of emotional symptoms, while headaches, over-eating, sweaty hands, and dryness of the mouth are physical signs of stress. There are also behavioral cues for stress, such as aggression, social withdrawal, and loss of concentration [3]. When a threat is perceived, a survival mechanism called the "fight or flight response" is activated to help the human body adapt to the situation quickly. In this mechanism, the central nervous system (CNS) signals the adrenal glands to release cortisol and adrenaline, which boost glucose levels in the bloodstream, quicken the heartbeat, and raise blood pressure. If the CNS does not succeed in returning to the normal state, the body's reaction continues, which in turn increases the risk of a heart attack or stroke [4]. Several techniques are used to capture physiological and physical stress measures: for example, the electrocardiogram (ECG) measures the heart's electrical activity, electroencephalography (EEG) records the brain's electrical activity, electrodermal activity (EDA) or galvanic skin response (GSR) measures continuous variations in the skin's electrical characteristics, electromyography (EMG) records electrical activity in muscles, photoplethysmography (PPG) estimates skin blood flow, and infrared (IR) sensing tracks eye activity. Prolonged, ongoing worrying can lead to chronic stress; this type of stress is the most harmful and has been linked to cancer and cardiovascular disease (CVD) [5]. Therefore, several approaches have been proposed in an attempt to identify stress triggers and the amount of stress. Some of these methods use instruments such as questionnaires to assess affective states, but these techniques usually suffer from memory and response biases. Stress detection via the analysis of various bio-signals is deemed more valuable and has thus been the focus of modern-day research. In particular, various bio-signals are collected from participants and then subjected to advanced signal processing algorithms in an attempt to extract salient features for classification by machine learning algorithms. In our project, we are interested in exploring new machine learning techniques together with wearable devices that record various bio-signals. The goal is the development of an automatic stress detection system based on the analysis of bio-signals through signal processing and machine learning. The outcome of this research will allow users to be notified when their bodies enter a state of unhealthy stress so that they may take preventive action to avoid unnecessary consequences.
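As a very rough sketch of the intended pipeline (windowed features from a bio-signal fed to a machine learning classifier), the Python snippet below extracts simple statistics from hypothetical GSR windows and trains a support vector machine with scikit-learn. The features, window length, synthetic data, and labels are placeholders; the project's actual feature set and model are still under investigation.

```python
# Sketch only (synthetic data, assumed features): classify stressed vs. relaxed
# windows of a galvanic skin response (GSR) signal with an SVM.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(42)

def window_features(window):
    """Simple per-window statistics often used as a starting point."""
    return [window.mean(), window.std(), window.max() - window.min(),
            np.abs(np.diff(window)).mean()]

# Synthetic 10-second GSR windows at 4 Hz: 'stressed' windows drift upward.
def make_window(stressed):
    base = rng.normal(2.0, 0.05, 40)
    return base + (np.linspace(0, 0.5, 40) if stressed else 0.0)

X = np.array([window_features(make_window(s)) for s in [0, 1] * 200])
y = np.array([0, 1] * 200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```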
-
-
-
Inhomogeneous Underwater Visible Light Communications: Performance Evaluation
Authors: Noha Hassan Anous, Mohamed Abdallah and Khalid Qaraqe
In this work, the performance of an underwater visible light communication (VLC) vertical link is evaluated. The underwater environment is known for its inhomogeneous nature versus depth. A mathematical model for the received power (Pr) is derived and bit error rates (BER) are computed under different underwater conditions. A numerical example is given to illustrate the deduced model. Our results suggest that an optimum transmitter-receiver separation exists, at which the BER is minimized for a certain transmission orientation.
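The abstract does not reproduce its derived Pr model, so as a generic illustration only, the sketch below computes received power with a simple Beer-Lambert attenuation law and the corresponding on-off keying bit error rate. The attenuation coefficient, transmit power, and noise figures are hypothetical, and the depth-dependent inhomogeneity of real seawater studied in the paper is not modeled here.

```python
# Generic illustration (not the paper's derived model): received power under
# Beer-Lambert attenuation and the resulting OOK bit error rate.
import numpy as np
from math import erfc, sqrt

def received_power(pt_w, distance_m, attenuation_per_m, rx_gain=1.0):
    """Exponential path loss of the optical beam over a water column."""
    return pt_w * rx_gain * np.exp(-attenuation_per_m * distance_m)

def ook_ber(pr_w, responsivity=0.5, noise_std_a=1e-4):
    """BER of on-off keying with Gaussian receiver noise: Q(I_signal / 2*sigma)."""
    photocurrent = responsivity * pr_w
    q_arg = photocurrent / (2 * noise_std_a)
    return 0.5 * erfc(q_arg / sqrt(2))

for d in [1, 5, 10, 20]:                       # transmitter-receiver separation (m)
    pr = received_power(pt_w=0.1, distance_m=d, attenuation_per_m=0.5)
    print(f"d = {d:2d} m  Pr = {pr:.3e} W  BER = {ook_ber(pr):.2e}")
```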
-
-
-
Framework of experiential learning to enhance student engineering skills
Authors: Fadi Ghemri, Houssem Fadi and Abdelaziz Bouras
In this research work, we propose a framework of experiential learning to enhance students' work skills and experience. This research aims to contribute to the development and expansion of local industry through long-term fundamental research that adds to the science base and addresses the needs of the national economy, by providing an adapted method and enhancing the teaching content and pedagogical organization to be more accurate and better aligned with the competency requirements of local employers.
-
-
-
QCRI's Live Speech Translation System
Authors: Fahim Dalvi, Yifan Zhang, Sameer Khurana, Nadir Durrani, Hassan Sajjad, Ahmed Abdelali, Hamdy Mubarak, Ahmed Ali and Stephan Vogel
In this work, we present Qatar Computing Research Institute's live speech translation system. Our system works with both Arabic and English. It is designed using an array of modern web technologies to capture speech in real time, and to transcribe and translate it using state-of-the-art Automatic Speech Recognition (ASR) and Machine Translation (MT) systems. The platform is designed to be useful in a wide variety of situations such as lectures, talks and meetings. It is often the case in the Middle East that audiences at talks understand either Arabic or English alone. This system enables the speaker to talk in either language, and the audience to understand what is being said even if they are not bilingual. The system consists of three primary modules: i) a web application, ii) an ASR system, and iii) a statistical/neural MT system. The three modules are optimized to work jointly and process the speech at a real-time factor close to one, which means that the systems are optimized to keep up with the speaker and provide the results with a short delay, comparable to what we observe in (human) interpretation. The real-time factor for the entire pipeline is 1.18. The web application is based on the standard HTML5 WebAudio application programming interface. It captures speech input from a microphone on the user's device and transmits it to the backend servers for processing. The servers send back the transcriptions and translations of the speech, which are then displayed to the user. Our platform features a way to instantly broadcast live sessions, so that anyone can see the transcriptions and translations of a session in real time without being physically present at the speaker's location. The ASR system is based on KALDI, a state-of-the-art toolkit for speech recognition. We use a combination of time delay neural networks (TDNN) and long short-term memory (LSTM) neural networks to ensure real-time transcription of the incoming speech while maintaining high-quality output. The Arabic and English systems have average word error rates of 23% and 9.7%, respectively. The Arabic system consists of the following components: i) a character-based lexicon of size 900K, which maps words to sound units to learn the acoustic representation; ii) 40-dimensional high-resolution features extracted for each speech frame to digitize the audio signal; iii) 100-dimensional i-vectors for each frame to facilitate speaker adaptation; iv) TDNN acoustic models; and v) a tri-gram language model trained on 110M words and restricted to a 900K vocabulary. The MT system has two choices for the backend: a statistical phrase-based system and a neural MT system. Our phrase-based system is trained with Moses, a state-of-the-art statistical MT framework, and the neural system is trained with Nematus, a state-of-the-art neural MT framework. We use Modified Moore-Lewis filtering to select the best subset of the available data to train our phrase-based system more efficiently. In order to speed up the translation even further, we prune the language models backing the phrase-based system, discarding knowledge that is not frequently used. On the other hand, our neural MT system is trained on all the available data, as its training scales linearly with the amount of data, unlike phrase-based systems.
Our neural MT system is roughly 3–5% better on the BLEU scale, a standard measure of translation quality. However, the existing neural MT decoders are slower than the phrase-based decoders, translating 9.5 tokens/second versus 24 tokens/second. The trade-off between efficiency and accuracy barred us from picking only one final system. By enabling both technologies, we expose the trade-off between quality and efficiency and leave it up to the user to decide whether they prefer the fast or the accurate system. Our system has been successfully demonstrated locally and globally at several venues such as Al Jazeera, MIT, BBC and TII. The state-of-the-art technologies backing the platform for transcription and translation are also available independently and can be integrated seamlessly into any external platform. The speech translation system is publicly available at http://st.qcri.org/demos/livetranslation.
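Since the abstract reports a real-time factor (RTF) of 1.18 for the whole pipeline, the small sketch below shows how such a figure is typically computed per stage and overall, with RTF defined as processing time divided by audio duration. The stage timings are made-up numbers chosen to sum to 1.18, not QCRI's measurements.

```python
# Illustrative only (made-up timings): compute per-stage and end-to-end
# real-time factor (RTF), i.e. processing time divided by audio duration.
audio_duration_s = 60.0                      # one minute of input speech

stage_processing_s = {                       # hypothetical stage timings
    "asr": 45.0,
    "mt": 20.0,
    "web/transport": 5.8,
}

for stage, t in stage_processing_s.items():
    print(f"{stage:14s} RTF = {t / audio_duration_s:.2f}")

total_rtf = sum(stage_processing_s.values()) / audio_duration_s
print(f"{'pipeline':14s} RTF = {total_rtf:.2f}")   # 1.18 with these numbers
```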
-
-
-
Humans and bots in controversial environments: A closer look at their interactions
Authors: Reham Al Tamime, Richard Giordano and Wendy Hall
Wikipedia is the most influential popular information source on the Internet and is ranked as the fifth most visited website [1] (Alexa, 2017). The English-language Wikipedia is a prominent source of online health information compared to other providers such as MedlinePlus and NHS Direct (Laurent and Vickers, 2009). Wikipedia has challenged the way that traditional medical encyclopaedia knowledge is built by creating an open sociotechnical environment that allows non-domain experts to contribute to its articles. This sociotechnical environment also allows bots – computer scripts that automatically handle repetitive and mundane tasks – to work with humans to develop, improve, maintain and contest information in Wikipedia articles. Contestation in Wikipedia is unavoidable as a consequence of its open nature, which means that it accepts contradictory views on a topic and involves controversies. The objective of this research is to understand the impact of controversy on the relationship between humans and bots in environments that are managed by the crowd. This study analyses all the articles under WikiProject Medicine, comprising 36,850 Wikipedia articles. Medical articles and their editing history have been harvested from the Wikipedia API, covering all edits from 2001 to 2016. The data include the revision ID, username, timestamp, and comment. The articles under WikiProject Medicine contain 6,220,413 edits and around 1,285,936 human and bot editors. To measure controversies, we studied reverted and undone edits. A revert on Wikipedia occurs when an editor, whether human or bot, restores the article to an earlier version after another editor's contribution. Undone edits are reverted single edits from the history of a page, without simultaneously undoing all constructive changes that have been made since the previous edit. Reverted and undone edits that occur systematically indicate controversy and conflict (Tsvetkova et al., 2017). To measure the relationship between humans and bots, we focused on both positive and negative relationships. A positive relationship is when an editor, such as a human, endorses another editor, such as a bot, by reverting or undoing a recent edit made to that editor's contribution. A negative relationship is when an editor, such as a human, discards another editor, such as a bot, by reverting or undoing that editor's contribution. Our results show that there is a relationship between controversial articles and the development of a positive relationship between humans and bots. The results demonstrate that bots and humans can behave differently in controversial environments. The study highlights some of the important features of building health-related knowledge on Wikipedia. The contribution of this work is to build on previous theories that consider web-based systems as social machines. These theories recognise the joint contribution of humans and machines to activities on the web, but assume a static type of relationship that is not sensitive to the environment in which humans and machines operate. Understanding the interactions between humans and bots is crucial for designing crowdsourced environments that are inclusive of their human and non-human populations. We discuss how our findings can help set up future research directions and outline important implications for research on crowds. References: Laurent, M. R. & Vickers, T. J. (2009) 'Seeking Health Information Online: Does Wikipedia Matter?' J Am Med Inform Assoc, 16(4), 471-479. Tsvetkova, M., García-Gavilanes, R., Floridi, L. and Yasseri, T. (2017) 'Even good bots fight: The case of Wikipedia.' PLoS One, 12(2): e0171774. [1] https://www.alexa.com/topsites
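As a minimal sketch of how reverts of the kind studied here can be identified from a page's revision history, the snippet below matches revision content hashes (the Wikipedia API exposes a SHA-1 per revision) and labels each reverter/reverted pair as human or bot. The field names, the toy history, and the username-based bot heuristic are illustrative assumptions, not the study's exact procedure.

```python
def is_bot(username):
    # Crude heuristic for illustration only; the study would rely on
    # Wikipedia's bot flags rather than username suffixes.
    return username.lower().endswith("bot")

def editor_pair(reverter, reverted):
    return ("bot" if is_bot(reverter) else "human") + "->" + \
           ("bot" if is_bot(reverted) else "human")

def find_reverts(revisions):
    """revisions: list of dicts ordered by time, each with 'user' and 'sha1'
    (the content hash reported per revision). A revert is detected when a
    revision restores content seen earlier on the page."""
    first_seen = {}   # sha1 -> index of the first revision with that content
    interactions = []
    for i, rev in enumerate(revisions):
        h = rev["sha1"]
        if h in first_seen and i - first_seen[h] > 1:
            # every revision between the two identical versions was discarded
            for discarded in revisions[first_seen[h] + 1:i]:
                interactions.append((rev["user"], discarded["user"],
                                     editor_pair(rev["user"], discarded["user"])))
        first_seen.setdefault(h, i)
    return interactions

history = [
    {"user": "Alice",      "sha1": "aaa"},
    {"user": "Vandal123",  "sha1": "bbb"},
    {"user": "CleanupBot", "sha1": "aaa"},   # restores Alice's version
]
print(find_reverts(history))   # [('CleanupBot', 'Vandal123', 'bot->human')]
```

Aggregating such interaction tuples per article is one simple way to relate revert activity (a controversy proxy) to who endorses or discards whom.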
-
-
-
Non-orthogonal Multiple Access for Visible Light Communications: Complementary Technology Enabling High Data Rate Services for 5G Networks
Authors: Galymzhan Nauryzbayev and Mohamed Abdallah
Introduction: The last decades have witnessed an explosive growth of myriad applications in wireless communications, which have become an inevitable part of everyday life. Such services can be characterized by high data content and consequently require high data rates. With respect to the fundamentals of information theory, the data rate at which information can be delivered to the receiver over a wireless channel is strongly linked to the signal-to-noise ratio (SNR) of the information signal and the corresponding channel bandwidth. These achievements in providing high data rates were mainly obtained at the price of substantially increased bandwidth (Hz) and energy (joules) resources. As a result, significant spectrum scarcity became a noticeable burden. Moreover, it was shown that exploiting additional RF bandwidth is no longer a viable solution to meet this high demand for wireless applications; e.g. 5G systems are expected to provide a 1 Gbps cell-edge data rate and to support data rates of between 10 Gbps and 50 Gbps. To satisfy this demand for higher data rates, optical wireless communication (OWC) has been considered a promising research area. One of these complementary technologies is visible light communication (VLC), which has several advantages such as a huge unoccupied spectrum, immunity to electromagnetic interference, low infrastructural expenditures, etc. VLC has gained considerable attention as an effective means of transferring data at high rates over short distances, e.g. indoor communications. A typical VLC system consists of a source (light emitting diodes, LEDs) that converts the electrical signal to an optical signal, and a receiver that converts the optical power into electrical current using detectors (photodiodes, PDs). Light beams propagating through the medium deliver the information from the transmitter to the receiver. To satisfy current and future demands for increasingly high data rates, the research community has focused on non-orthogonal multiple access (NOMA), regarded as one of the emerging wireless technologies expected to play an important role in 5G systems due to its ability to serve many more users through non-orthogonal resource allocation compared to traditional orthogonal multiple access (OMA) schemes. Therefore, NOMA has been shown to be a promising instrument to improve the spectral efficiency of modern communication systems in combination with other existing technologies. Purpose: This work aims to investigate the performance of the spectrally and energy efficient orthogonal frequency-division multiplexing (SEE-OFDM) based VLC system combined with the NOMA approach. We model a system consisting of one transmitter and two receivers located in an indoor environment. Methods: First, we specify the users' locations and estimate the channel state information to determine the so-called "near" and "far" users needed to implement the NOMA approach. Moreover, we assume that the "near" user exploits a successive interference cancellation algorithm for interference decoding, while the other user treats the interfering signal as noise. Next, we consider two coefficients defining the power portions allocated to the receivers. Then we apply an algorithm to successively demodulate the transmitted signals, since each user observes a superposition of the signals designated for both receivers, with a predefined target bit-error rate (BER) threshold (10⁻⁴).
Once the target BER is achieved, we estimate the data rate obtainable for a certain set of power-allocation coefficients. Results: The results show that the indoor SEE-OFDM-based VLC network can be efficiently combined with NOMA, and the target BER can be achieved by both receivers. Moreover, the BER of the "far" user is better since more power is allocated to this user. Next, we evaluate the achievable data rate and compare the results with those attainable for OMA. It can be noticed that the NOMA approach outperforms OMA. Conclusions: We analyzed the performance of the two-user indoor VLC network scenario deployed with the SEE-OFDM and NOMA techniques. It was shown that the recently proposed SEE-OFDM technique can be effectively exploited along with the non-orthogonal approach to achieve the higher spectral efficiency promised by NOMA. Both receivers were shown to be able to achieve the target BER within a narrow range of the power-allocation coefficients. Finally, for the defined system parameters, it was demonstrated that the NOMA approach achieves higher data rates compared to the OMA scenario.
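To illustrate the power-domain superposition and successive interference cancellation described in the Methods, here is a minimal Monte Carlo sketch for two users over an AWGN channel with BPSK. The power-allocation coefficients, real channel gains, and noise level are illustrative assumptions and the model is far simpler than the SEE-OFDM VLC setup of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000                      # number of symbols
a_far, a_near = 0.8, 0.2         # power-allocation coefficients (sum to 1)
h_far, h_near = 0.4, 1.0         # illustrative real channel gains ("far" user is weaker)
noise_std = 0.1

b_far = rng.integers(0, 2, N) * 2 - 1    # BPSK symbols in {-1, +1}
b_near = rng.integers(0, 2, N) * 2 - 1

# Superposition coding at the transmitter
x = np.sqrt(a_far) * b_far + np.sqrt(a_near) * b_near

# "Far" user: decode its own symbol directly, treating the near user's signal as noise.
y_far = h_far * x + noise_std * rng.standard_normal(N)
b_far_hat_at_far = np.sign(y_far)

# "Near" user: decode the far user's (stronger) symbol first, subtract it,
# then decode its own symbol (successive interference cancellation).
y_near = h_near * x + noise_std * rng.standard_normal(N)
b_far_hat = np.sign(y_near)
residual = y_near - h_near * np.sqrt(a_far) * b_far_hat
b_near_hat = np.sign(residual)

print("BER far user :", np.mean(b_far_hat_at_far != b_far))
print("BER near user:", np.mean(b_near_hat != b_near))
```

Sweeping the coefficient pair (a_far, a_near) in such a simulation is the simplest way to see the narrow range of allocations for which both users meet a target BER.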
-
-
-
Deep Learning for Traffic Analytics Application FIFA2022
Authors: Abdelkader Baggag, Abdulaziz Yousuf Al-Homaid, Tahar Zanouda and Michael Aupetit
As urban data keeps getting bigger, deep learning is coming to play a key role in providing big-data predictive analytics solutions. We are interested in developing a new generation of deep learning based computational technologies that predict traffic congestion and support crowd management. In this work, we are mainly interested in efficiently predicting future traffic with high accuracy. The proposed deep learning solution allows revealing the latent (hidden) structure common to different cities in terms of dynamics. The data-driven insights of traffic analytics will help stakeholders, e.g., security forces, stadium management teams, and travel agencies, to take fast and reliable decisions to deliver the best possible experience for visitors. Current traffic data sources in Qatar are incomplete, as sensors are not yet permanently deployed for data collection. The following topics are being addressed: Predictive Crowd and Vehicle Traffic Analytics: Forecasting the flow of crowds and vehicles is of great importance to traffic management, risk assessment and public safety. It is affected by many complex factors, including spatial and temporal dependencies, infrastructure constraints and external conditions (e.g. weather and events). If one can predict the flow of crowds and vehicles in a region, tragedies can be mitigated or prevented by utilizing emergency mechanisms, such as conducting traffic control, sending out warnings, signaling diversion routes or evacuating people, in advance. We propose a deep-learning-based approach to collectively forecast the flow of crowds and vehicles. Deep models, such as deep neural networks, are currently the best data-driven techniques to handle heterogeneous data and to discover and predict complex data patterns such as traffic congestion and crowd movements. We will focus in particular on predicting the inflow and outflow of crowds or vehicles to and from important areas, tracking the transitions between these regions. We will study different deep architectures to increase the accuracy of the predictive model, and explore how to integrate spatio-temporal information into these models. We will also study how deep models can be re-used without retraining to handle new data and better scale to large data sets. What-If Scenario Modeling: Understanding how congestion or overcrowding at one location can cause ripples throughout a transportation network is vital to pinpoint traffic bottlenecks for congestion mitigation or emergency response preparation. We will use predictive modeling to simulate different states of the transportation network, enabling the stakeholder to test different hypotheses in advance. We will use the theory of multi-layer networks to model and then simulate the complex relationship between different but coexisting types of flows (crowds, vehicles) and infrastructures (roads, railways, crossings, passageways, squares…). We will propose a visual analytics platform that will provide the necessary visual handles to generate different cases, navigate through different scenarios, and identify potential bottlenecks, weak points and resilient routes. This visualization platform, connected to the real-time predictive analytics platform, will support stakeholder decisions by automatically matching the current situation to already explored scenarios and possible emergency plans.
Safety and Evacuation Planning based on Resilience Analytics: Determining the best routes to clear congested or overcrowded areas, or new routes to divert traffic and people from such areas, is crucial to maintain high security and safety levels. The visual analytics platform and the predictive model will enable the testing and set-up of safety and evacuation plans to be applied in case of an upcoming emergency detected by the predictive analytics platform. Overall, the proposed approach is independent of the type of flows, i.e., vehicles or people, or infrastructures, as long as proper sensors (magnetic loops, video cameras, GPS tracking, etc.) provide relevant data about these flows (number of people or vehicles per time unit along a route of some layer of the transportation network). The proposed data-driven learning models are efficient, and they adapt to the specificities of the type of flows by updating the relevant parameters during the training phase.
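As a minimal sketch of the inflow/outflow prediction task described above, the PyTorch snippet below maps a short history of city-grid inflow/outflow maps to the next time step's maps with a small fully-convolutional network. The grid size, history length, architecture and synthetic data are illustrative assumptions, not the project's actual model or data.

```python
import torch
import torch.nn as nn

# Illustrative dimensions: 32x32 spatial grid, 2 channels (inflow, outflow),
# 4 past time steps used to predict the next step.
HIST, CH, H, W = 4, 2, 32, 32

class FlowPredictor(nn.Module):
    """Stacks historical frames along the channel axis and applies a small
    fully-convolutional network to predict the next inflow/outflow map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(HIST * CH, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, CH, kernel_size=3, padding=1),
        )

    def forward(self, x):              # x: (batch, HIST*CH, H, W)
        return self.net(x)

model = FlowPredictor()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Synthetic stand-in data; in practice these would be historical counts of
# people or vehicles entering and leaving each grid cell per time interval.
history = torch.rand(8, HIST * CH, H, W)
target = torch.rand(8, CH, H, W)

for step in range(5):                  # tiny training loop for illustration
    pred = model(history)
    loss = loss_fn(pred, target)
    optim.zero_grad()
    loss.backward()
    optim.step()
print("final training loss:", loss.item())
```

Richer spatio-temporal architectures (e.g. recurrent or multi-layer-network-aware models) would replace this toy network, but the input/output contract stays the same.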
-
-
-
Virtual Reality Glove for Falconry
Falconry is a traditional Arab sport and has great significance in Qatari society, as it is a major part of the culture. Falconry involves hunting small birds or animals using different types of falcons. Falconry in virtual reality (VR) can help preserve Qatari culture by making the sport easy to access for all kinds of people. The main idea behind this project is to educate people living in Qatar, as well as visitors for the 2022 football World Cup, and let them experience real-time falconry. The proposed design in our project could also help professional falconers use and learn the VR technology, which can make them better handlers. Moreover, the rapid development in technologies related to virtual reality has made real-time imitation of the real world possible. A VR environment can be built with software such as Unity3D, but realizing the real-time feel, weight, pressure, movement, and vibration of any kind in VR is hard and still a work in progress. There are also various new technologies in this field, such as haptics, but these technologies are expensive and there is no definite hardware that actually mimics the movement of a falcon when it stands on the hand. The main hardware design is a glove that can be detected virtually and can detect the movement of different types of falcons on the player's hand. The design proposed in our project will give an extensive real-time feel of the falcon on the user's hand using various available hardware components, which are cheap and easy to maintain. The design of our glove paves the way for further enhancement of movement realization in VR for other sports, medicine, etc. The major requirements for the game of falconry were obtained from the Qatari Society of Al-Gannas, with whom we have a collaboration for this project.
-
-
-
Enabling Efficient Secure Multiparty Computation Development in ANSI C
Authors: Ahmad Musleh, Soha Hussein, Khaled M. Khan and Qutaibah M. Malluhi
Secure Multi-Party Computation (SMPC) enables parties to compute a public function over private inputs. A classical example is the millionaires problem, where two millionaires want to figure out who is wealthier without revealing their actual wealth to each other. The insight gained from the secure computation is nothing more than what is revealed by the output (in this case, who is wealthier but not the actual value of the wealth). Other applications of secure computation include secure voting, online bidding and privacy-preserving cloud computations, to name a few. Technological advancements are making secure computations practical, and recent optimizations have made dramatic improvements in their performance. However, there is still a need for effective tools that facilitate the development of SMPC applications using standard and familiar programming languages and techniques, without requiring the involvement of security experts with special training and background. This work addresses the latter problem by enabling SMPC application development through programs (or repurposing existing code) written in a standard programming language such as ANSI C. Several high-level language (HLL) platforms have been proposed to enable secure computation, such as Obliv-C [1], ObliVM [2] and Frigate [3]. These platforms utilize a variation of Yao's garbled circuits [4] in order to evaluate the program securely. The source code written for these frameworks is then converted into a lower-level intermediate language that utilizes garbled circuits for program evaluation. Garbled circuits have one party (the garbler) who compiles the program that the other party (the evaluator) runs, and the communication between the two parties happens through oblivious transfer. Garbled circuits allow two parties to do this evaluation without the need for a trusted third party. These frameworks have two common characteristics: they either define a new language [2] or make a restricted extension of a current language [1]. This is somewhat prohibitive, as it requires the programmer to have a sufficient understanding of SMPC-related constructs and semantics. This process is error-prone and time-consuming for the programmer. The other characteristic is that they use combinational circuits, which often require creating and materializing the entire circuit (whose size may be huge) before evaluation. This introduces a restriction on the program being written. TinyGarble [5], however, is a secure two-party computation framework that is based on sequential circuits. Compared with the frameworks mentioned earlier, TinyGarble outperforms them by orders of magnitude. We are developing a framework that can automatically convert an HLL program (in this case ANSI C) into a hardware description language, which is then evaluated securely. The benefit of having such a transformation is that it does not require knowledge of unfamiliar SMPC constructs and semantics, and it performs the computation in a much more efficient manner. We are combining the efficiency of sequential circuits for computation with the expressiveness of an HLL like ANSI C to develop a secure computation framework that is expected to be effective and efficient. Our proposed approach is two-fold: first, it offers a separation of concerns between the function of computation, written in C, and a secure computation policy to be enforced.
This leaves the original source code unchanged, and the programmer is only required to specify a policy file where he/she specifies the functions/variables which need secure computation. Secondly, it leverages the current state-of-the-art framework to generate sequential circuits. The idea is to convert the original source code to Verilog (a hardware description language), as this can then be transformed into a standard circuit description which TinyGarble [5] would run. This will enable us to leverage TinyGarble's efficient sequential circuits. The result combines the best of both worlds: an HLL program that is converted and evaluated as a sequential circuit. References [1] S. Zahur and D. Evans, "Obliv-C: A language for extensible data-oblivious computation," IACR Cryptology ePrint Archive, vol. 2015, p. 1153, 2015. [2] C. Liu, X. S. Wang, K. Nayak, Y. Huang, and E. Shi, "ObliVM: A programming framework for secure computation," in 2015 IEEE Symposium on Security and Privacy, SP 2015, San Jose, CA, USA, May 17-21, 2015, pp. 359–376, 2015. [3] B. Mood, D. Gupta, H. Carter, K. R. B. Butler, and P. Traynor, "Frigate: A validated, extensible, and efficient compiler and interpreter for secure computation," in IEEE European Symposium on Security and Privacy, EuroS&P 2016, Saarbrücken, Germany, March 21-24, 2016, pp. 112–127, 2016. [4] A. C. Yao, "Protocols for secure computations (extended abstract)," in 23rd Annual Symposium on Foundations of Computer Science, Chicago, Illinois, USA, 3-5 November 1982, pp. 160–164, 1982. [5] E. M. Songhori, S. U. Hussain, A. Sadeghi, T. Schneider, and F. Koushanfar, "TinyGarble: Highly compressed and scalable sequential garbled circuits," in 2015 IEEE Symposium on Security and Privacy, SP 2015, San Jose, CA, USA, May 17-21, 2015, pp. 411–428, 2015.
-
-
-
Demonstration of DRS: Dynamic Resource Scheduler for Distributed Stream Processing
By Yin Yang
We propose to demonstrate DRS, a novel dynamic resource scheduler module for distributed stream processing engines (SPEs). The main idea is to model the system response time as a function of input characteristics, including the volume, velocity, and distribution statistics of the streaming data. Based on this model, DRS decides on the amount of resources to allocate to each streaming operator in the system, so that (i) the system satisfies real-time response constraints at all times and (ii) total resource consumption is minimized. DRS is a key component to enable elasticity in a distributed SPE. DRS is a major outcome of the QNRF/NPRP project titled "Real-Time Analytics over Sports Video Streams". As the title suggests, the goal of this project is to analyze sports (especially soccer) videos in real time, using distributed computing techniques. DRS fits the big picture of the project, as it enables dynamic provisioning of computational resources in response to changing data distribution in the input sports video streams. For instance, consider player detection based on region proposals, e.g., using Faster R-CNN. Even though the frame rate of the soccer video stays constant, the number of region proposals can vary drastically and unpredictably (e.g., in one frame there is only one player, and in the next frame there can be all 22 players). Consequently, the workload of the convolutional neural network that performs the detection for each region proposal varies over time. DRS ensures that there are sufficient resources (e.g., GPUs) for processing the video in real time at any given time point; meanwhile, it avoids over-provisioning by accurately predicting the amount of resources needed. The demo will include a poster, a video, and a live, on-site demo using a laptop computer connected to a cluster of remote machines. We will demonstrate to the audience how DRS works, when it changes the resource allocation plan, how it executes the new allocation, and the underlying model of DRS. Acknowledgement: This publication was made possible by NPRP grant NPRP9-466-1-103 from the Qatar National Research Fund (a member of Qatar Foundation). The findings achieved herein are solely the responsibility of the author[s].
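To make the scheduling idea concrete, here is a minimal sketch that allocates to each streaming operator the fewest executors keeping its estimated response time under a real-time bound, using a simple M/M/1-style latency model. The latency model, the operator rates and the evenly split budget are illustrative assumptions; DRS's actual queueing-based model is more refined.

```python
def min_executors(arrival_rate, service_rate_per_executor, latency_bound, max_k=64):
    """Smallest k such that the estimated sojourn time 1/(k*mu - lambda) stays
    below the latency bound (simple aggregated-capacity approximation)."""
    for k in range(1, max_k + 1):
        capacity = k * service_rate_per_executor
        if capacity > arrival_rate and 1.0 / (capacity - arrival_rate) <= latency_bound:
            return k
    raise ValueError("latency bound not achievable with max_k executors")

# Illustrative operators of a video-analytics topology:
# (name, input tuples/s, tuples/s handled by one executor)
operators = [("region-proposal", 800.0, 300.0),
             ("player-detector", 450.0, 120.0),
             ("tracker",         450.0, 500.0)]
latency_bound = 0.05                        # 50 ms end-to-end budget
per_op_bound = latency_bound / len(operators)   # split evenly here for simplicity

plan = {name: min_executors(lam, mu, per_op_bound) for name, lam, mu in operators}
print(plan)   # e.g. {'region-proposal': 3, 'player-detector': 5, 'tracker': 2}
```

Re-running such an allocation whenever the measured arrival rates change is the essence of the dynamic behaviour the demo shows, with the real scheduler replacing the toy latency formula.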
-
-
-
Integration of Multisensor Data and Deep Learning for Real-time Occupancy Detection for Building Environment Control Strategies
Authors: Dabeeruddin Syed and Amine Bermak
One of the most prominent areas of energy consumption in residential units is heating, ventilation and air-conditioning (HVAC) systems. Conventional HVAC systems depend on wired thermostats that are deployed at fixed locations and hence are not convenient and do not respond to the dynamic nature of the thermal envelope of buildings. Moreover, it is important to note that the spatial temperature distribution is not uniform. Current environment control strategies are based on the maximum occupancy numbers for the building, but there are always certain areas of a building which are used less frequently and are cooled needlessly. Collecting real-time occupancy data and mining it to predict the occupancy patterns of the building will help in developing energy-effective strategies for the regulation of HVAC systems through a central controller. In this work, we have deployed a network of multiple wireless sensors (humidity, temperature, CO2 sensors, etc.), computational elements (in our case, a Raspberry Pi, to make it cost-effective) and a camera network, with the aim of integrating the data from the multiple sensors in a large multifunction building. The sensors are deployed at multiple locations in such a way that the non-uniform spatial temperature distribution is accounted for, and they capture the various environmental conditions at a fine temporal and spatial granularity. The Pi camera is connected to a Raspberry Pi which is fixed at an elevation. Detection is performed using the OpenCV library and Python. This system can detect occupancy with an accuracy of up to 90%. For occupancy detection and counting, a linear SVM is trained on positive and negative image samples, and the evaluation on test images or a video feed makes use of non-maximum suppression (the NMS algorithm) to ignore redundant, overlapping HOG (Histogram of Oriented Gradients) boxes. The data collected by the sensors is sent to the central controller on which the video processing algorithm is also running. Using the multiple environmental factors available to us, models are developed to predict the usage of the building. These models help us to define the control parameters for the HVAC systems in an adaptive manner, in such a way that these parameters not only help in reducing the energy used in a building but also help to maintain thermal comfort. The control parameters are then sent as IR signals to AC systems that are controlled by IR remotes, or as wireless signals to AC systems controlled by wireless thermostats. In comparison to a conventional temperature controller, our system avoids overcooling of areas to save energy and predicts the occupancy in the buildings so that the temperature is brought within the human comfort zone before over-occupancy takes place. Our system also has the benefit of using wireless sensors that operate on low power, but the tradeoff between power and communication frequency should be well maintained. Our system additionally has two features: firstly, it can provide live video streaming for remote monitoring using a web browser as the user interface, and secondly, it sends automatic notifications as messages in case of anomalies like abnormally high temperatures or high carbon dioxide concentration in a room. These two features can be used as cost-effective replacements for traditional CCTV and burglar alarm systems, respectively.
Keywords: wireless sensors, air conditioning, OpenCV, NMS algorithm, Histogram of Oriented Gradients, thermal comfort.
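A minimal OpenCV sketch of the detection-and-counting stage is shown below, using the library's prebuilt HOG descriptor with its default people-detector SVM and a simple greedy non-maximum suppression step. The abstract's system trains its own linear SVM on Pi-camera images, so the pretrained detector, the image file name and the NMS threshold here are stand-in assumptions.

```python
import cv2
import numpy as np

def non_max_suppression(boxes, overlap_thresh=0.5):
    """Greedy NMS: keep larger boxes and drop overlapping smaller ones.
    boxes: array of [x, y, w, h]."""
    if len(boxes) == 0:
        return boxes
    x1, y1 = boxes[:, 0], boxes[:, 1]
    x2, y2 = boxes[:, 0] + boxes[:, 2], boxes[:, 1] + boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(areas)[::-1]
    keep = []
    while len(order) > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        overlap = inter / areas[order[1:]]
        order = order[1:][overlap <= overlap_thresh]
    return boxes[keep]

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

frame = cv2.imread("room.jpg")     # assumes one saved frame from the Pi camera
rects, _ = hog.detectMultiScale(frame, winStride=(8, 8), padding=(8, 8), scale=1.05)
occupancy = len(non_max_suppression(np.array(rects)))
print("estimated occupancy:", occupancy)
```

The per-frame occupancy count produced this way is what the central controller would combine with the temperature, humidity and CO2 readings when setting the HVAC control parameters.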
-
-
-
Challenges and Success in Setting up a 3D Printing Lab Integrated with EMR and VNA in a Tertiary Care Hospital in the Middle East
Authors: Zafar Iqbal, Sofia Ferreira, Avez Rizvi, Smail Morsi, Fareed Ahmed and Deepak Kaura
In recent years, 3D printing has shown exponential growth in clinical medicine, research, and education (Carlos et al.). Imaging departments are at the center of 3D printing service delivery and of efforts to establish 3D printing labs, making a unique contribution towards patient care (Kent Thielen et al.). Building a fully electronic medical record (EMR) integrated workflow to deliver 3D services offers unique advantages for clinicians. At Sidra Medicine, we have successfully tested the electronic process by generating 3D orders and delivering printed models such as hearts, skulls, and mandibles. To facilitate clinicians and 3D printing lab staff, we developed an automated workflow in our EMR and radiology information system (RIS). Clinicians use our Cerner EMR to order 3D printing services by selecting the available 3D printing orders for each modality, i.e. MR, CT, and US. The order also allows them to add their requirements by filling out the relevant order entry fields (OEFs). 3D printing orders populate the RIS worklist for 3D lab staff to start, complete, and document the service process. Consultation with ordering clinicians and radiologists is also vital in the 3D printing process, so we developed a message template for the communication between lab staff and clinicians, which also has the capability to attach 3D model PDFs. 3D lab staff upload the models to our Vendor Neutral Archive (VNA) before completing the order, storing the models in the patient's record. Building a 3D workflow in an existing EMR has potential benefits for facilitating the 3D service delivery process. It allows 3D printing to rank amongst the other modalities important for patient care by living where all other clinical care orders reside. It also allows 3D lab staff to document the process through quick communication.
-
-
-
E-Learning for Persons with Disabilities
E-learning is the use of Internet technologies or computer-assisted instruction to enhance and improve knowledge and performance. Knowledge is a basic human right that should be accessible to everyone regardless of their disabilities. E-learning technologies offer learners control over content, learning sequence, pace of learning, time, and often media, allowing them to tailor their experiences to meet their personal learning objectives. This paper explores how adaptive e-learning for persons with disabilities, focusing on intellectual disabilities in Higher Education (HE), can show the importance of making technology and digital content accessible to students with disabilities. Compared with face-to-face education, and with respect to 'electronic' versus 'traditional' learning methods, this way of adapting e-learning can be considered a competent complement to, or substitute for, traditional instruction. The paper also examines the current progress in Qatar HE, because ubiquitous technologies have become a positive force of transformation and a crucial element of any personal development/empowerment and institutional framework for inclusive development. Keywords: e-learning, persons with disabilities, intellectual disabilities, learning methods.
-
-
-
Proactive Automated Threat Intelligence Platform
Imagine you are sitting at a public coffee shop and using the free wifi on your work device. As long as you are connected to the corporate VPN, you are well protected. The moment you go off it, you are no longer protected. Your laptop is now open to being hacked and attacked at public wifi locations like airports, hotels, coffee shops, parks, etc. My proposed solution involves an automated, proactive, cloud-based threat intelligence platform that will monitor and detect threats attacking you in real time, not just while you are at a public location but also when you are at home. The system works on a zero-trust framework where there are no trusted networks or zones. Each system with an IP address has its own intrusion detection and prevention system, combined with special localized analysis of malware that is specifically targeting you. Most antivirus and anti-malware companies do not write their own signatures; in fact, they buy them from smaller companies. My proposed solution will analyze malware targeted specifically at you and create a defensive signature within minutes to neutralize and eradicate threats against you within an hour across your entire infrastructure. There will be no need to wait 2–3 days for antivirus and anti-malware companies to come up with signatures and offer you protection.
-
-
-
A Passive RFID Sense Tag for Switchgear Thermal Monitoring in Power Grid
Authors: Bo Wang and Tsz-Ngai Lin
I. Background: In the power grid, circuit breakers in switchgears are often the last line of defense when big systems must be protected from faults [1]; sudden switchgear failures can cause long outages, huge economic losses and even threats to public safety. Based on field experience, the major causes of switchgear failure are loose or corroded metal connections, degraded cable insulation and external agents (e.g. dust, water) [2]. Due to ohmic loss at these weak points, the causes of switchgear failure are always accompanied by an increased thermal signature over time. With continuous thermal monitoring inside the switchgear, adequate data can be collected to make timely failure prediction and prevention possible, especially for equipment deployed in harsh environments. II. Objective: This paper presents the design of a passive radio frequency identification (RFID) sense tag, which measures temperature at critical spots of the switchgear and wirelessly (EPC C1G2 standard) transmits the data to the base station for real-time analysis. Compared with infrared imaging [2], surface acoustic wave (SAW) sensing systems or fiber Bragg grating (FBG) sensing systems [1][3], no cables for power and communication are necessary, which avoids potential side effects like arcing in the grid after the addition of the sensor. The use of a standard CMOS process results in a cost-effective solution, and the proposed passive wireless sensor can be easily retrofitted to existing switchgears with simple bolted connections. III. Passive Tag Design and Measurement Results: Fig. 1 shows the proposed passive tag with temperature sensing capability. The power management unit in the chip harvests the incoming RF wave (860∼960 MHz) and sustains all the other on-chip building blocks (sensor, clock, memory, baseband, modem). The energy harvesting efficiency is in the range of 15%∼25%, depending on the operating mode of the tag. With 10 uW system power, the effective reading distance of this tag is 4.5 m∼6 m. The on-chip temperature sensor adopts a PNP bipolar transistor as the sensing device, which has a temperature sensitivity of ∼2 mV/°C [4]. By using a triple-slope analog-to-digital converter (A/D), temperature-sensitive voltages are digitized and transmitted back to the reader after modulation by the modulator. Because there is no battery or other energy source on the device, the current drawn by the tag, and especially by the sensor, should be well below a milliampere to maintain the tag sensitivity. In this work, a passive integrator instead of an active one is used in the A/D, where the nonlinearity error it causes is compensated by adding an additional nonlinear current into the temperature signal. The overall power consumption of the sensor is 0.75 uW, and it achieves 10-bit sensing resolution (0.12 °C/LSB) within a 10 ms conversion time, corresponding to a resolution FoM of 1.08×10² pJ·K², which is among the most energy-efficient embedded sensor designs. Fig. 2(a) shows the micro-photograph of the designed passive RFID sense tag. Fig. 2(b) shows its ceramic package, which can be readily installed at spot locations with a bolted connection. By designing the antenna with an additional ground layer, this tag is able to work in switchgear with a full metal enclosure [5]. The measured tag sensitivity is -12 dBm. By measuring and calibrating multiple samples, the overall sensing precision of the tag is +/-1.5 °C, which is enough for switchgear thermal monitoring, as shown in Fig. 3(a).
Thanks to the designed on-chip supply protection circuit, the sensor performance does not degrade much with the reading power or reading distance (received power ∝ 1/distance²), as shown in Fig. 3(a). IV. Conclusion: The combination of passive RFID tags with sensors enables many new applications and helps to bring embedded intelligence to the legacy power grid. The designed passive sense tag is low-cost and offers robust sensing performance, achieved by optimizing the tag at the system level and using low-power circuit design techniques. By re-designing the tag package, it can also be used in other applications such as cold supply chains or monitoring of highly flammable goods. Acknowledgement: This work is in collaboration with Land Semiconductor Ltd., Hangzhou, China; the authors thank Mr. Qibin Zhu and Mr. Shengzhou Lin for helping with the measurements. References: [1] G.-M. Ma et al., "A Wireless and Passive Online Temperature Monitoring System for GIS Based on Surface-Acoustic-Wave Sensor," IEEE Trans. on Power Delivery, vol. 31, no. 3, pp. 1270-1280, June 2016. [2] Top Five Switchgear Failure Causes, Netaworld. [Online]. Available: http://www.netaworld.org/sites/default/files/public/neta-journals/NWsu10-NoOutage-Genutis.pdf, accessed Oct. 2017. [3] Fundamentals of Fiber Bragg Grating (FBG) Optical Sensing, NI White Papers. [Online]. Available: http://www.ni.com/white-paper/11821/en/, accessed Oct. 2017. [4] B. Wang, M.-K. Law and A. Bermak, "A Precision CMOS Voltage Reference Exploiting Silicon Bandgap Narrowing Effect," IEEE Trans. on Electron Devices, vol. 62, no. 7, pp. 2128-2135, July 2015. [5] Chong Ryol Park and Ki Hwan Eom, "RFID Label Tag Design for Metallic Surface Environments," Sensors, vol. 11, no. 1, pp. 938-948, Jan. 2011.
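The resolution figure of merit quoted above follows the common definition FoM = power × conversion time × (resolution)². The short check below reproduces the 1.08×10² pJ·K² value from the numbers given in the abstract.

```python
# Resolution FoM = power x conversion time x (resolution)^2
P = 0.75e-6        # sensor power, W  (0.75 uW)
t_conv = 10e-3     # conversion time, s  (10 ms)
res = 0.12         # resolution, degC per LSB

fom = P * t_conv * res ** 2          # joules * K^2
print(fom * 1e12, "pJ*K^2")          # ~108 pJ*K^2, i.e. 1.08e2 pJ*K^2
```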
-
-
-
Coexistence of IEEE 802.15.4g and WLAN
Authors: Saad Mohamed Shalbia, Ridha Hamila and Naofal Al-Dhahir
The aging electric grid was established a hundred years ago, when electricity needs were simple. Power plants were centralized and most homes had only small energy demands, such as a few lamps and radios. The grid's role was to carry electricity from utilities to consumers' homes. This limited one-way interaction makes it difficult for the grid to respond to the sudden changes and higher energy demands of the 21st century. The smart grid (SG) is a two-way network that allows electricity and information to be exchanged between the utility and its customers over the same network, by installing real-time sensors that collect data about ever-changing power consumption. It is an integrated network of communications, automated control, computers, and tools operating together to make the grid more efficient, reliable, secure and greener. The SG integrates more technologies such as wind and solar energy and plug-in electric vehicles (PEVs). The SG will replace the aging electric grid; homes and utilities can better communicate with each other to manage electricity usage by measuring the consumer's consumption instantaneously through smart meters. As mentioned before, SG infrastructure enables efficient integration of PEVs, which may play an important role in balancing the SG during critical peak or emergency times by injecting more power into the grid. Two-way dialogue enables services where plug-in hybrids (PHEVs) communicate with the grid to obtain information about grid demand and decide whether to supply the grid with power or charge their batteries from the grid. This requires a modern wireless communication network. IEEE 802.15.4g was introduced as a standard for smart utility networks (SUN) to enable communication between different parts of the SG. IEEE 802.15.4g works in different frequency bands; our work concentrates on the 2.4 GHz ISM (Industrial, Scientific and Medical) band, which is unlicensed and overcrowded with devices from other standards, e.g. ZigBee, Bluetooth and wireless local area networks (WLAN). The desired SUN signal may overlap with other interfering signals working in the same band, which hinders the receiver's ability to extract the proper signal; this is called the coexistence problem. Thus, in this contribution the coexistence mechanism is investigated thoroughly in order to improve the performance. The SUN has been studied and investigated considering signal attenuation due to path loss in the presence of additive interference and white Gaussian noise at the receiver. The effect of packet length on the packet error rate (PER) is studied to find the optimum packet length that achieves the maximum effective throughput for the network in coexistence with the WLAN interfering packets. Employing a longer packet length reduces the relative overhead, but leads to a higher PER as more interferers collide with the desired packet. Conversely, using a shorter packet length provides a lower PER but higher overhead due to the packet header and preamble, reducing the throughput. Simulations showed that, as the signal-to-interference-plus-noise ratio (SINR) increases, longer packet lengths can be used to achieve maximum throughput. Moreover, a multipath Rayleigh fading channel has also been introduced, along with minimum mean square error (MMSE) equalization as an interference mitigation technique. Simulations showed that MMSE achieves good performance and improves the PER in coexistence with the WLAN interfering system.
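A minimal sketch of the packet-length trade-off described above is given below: for a fixed header/preamble overhead and a bit-error probability caused by the interference, packet error rate grows with packet length while overhead shrinks, so the effective throughput peaks at an intermediate payload size. The PHY rate, overhead, the independent-bit-error model and the BER value are illustrative assumptions, not the paper's simulation setup.

```python
import numpy as np

data_rate = 250e3                     # bit/s, illustrative 802.15.4-style PHY rate
overhead_bits = 6 * 8 + 16 * 8        # preamble/SHR + MAC header, illustrative
ber = 2e-5                            # effective bit-error probability under interference

payload_bytes = np.arange(16, 2048, 16)
packet_bits = overhead_bits + 8 * payload_bytes

per = 1.0 - (1.0 - ber) ** packet_bits             # packet error rate
efficiency = (8 * payload_bytes) / packet_bits     # payload fraction of airtime
throughput = data_rate * efficiency * (1.0 - per)  # effective goodput, bit/s

best = payload_bytes[np.argmax(throughput)]
print(f"optimum payload ~{best} bytes, goodput ~{throughput.max() / 1e3:.1f} kbit/s")
```

Lowering the assumed BER (i.e. raising the SINR) shifts the optimum toward longer packets, which is the qualitative behaviour reported in the abstract.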
-
-
-
Data Privacy in Online Social Networks With Fine-Grained Access Control
Authors: Ahmed Khalil Abdulla and Dr. Spiridon Bakiras
Online Social Networks (OSNs), such as Facebook and Twitter, are popular platforms that enable users to interact and socialize through their networked devices. However, the social nature of such applications forces users to share a great amount of personal data with other users and the OSN service providers, including pictures, location check-ins, etc. Even though some OSNs offer configurable privacy controls that limit access to shared data, users might misconfigure these controls due to their complexity or a lack of clear instructions. Furthermore, the fact that OSN service providers have full access to the data stored on their servers is an alarming thought, especially for users who are conscious about their privacy. For example, OSNs might share such data with third parties, data mine them for targeted advertisements, collect statistics, etc. As a result, data and communication privacy over OSNs is a popular topic in the data privacy research community. Existing solutions include cryptographic mechanisms [1], trusted third parties [2], external dictionaries [3], and steganographic techniques [4]. Nevertheless, none of the aforementioned approaches offers a comprehensive solution that (i) implements fine-grained access control over encrypted data and (ii) works seamlessly over existing OSN platforms. To this end, we will design and implement a flexible and user-friendly system that leverages encryption-based access control and allows users to assign arbitrary decryption privileges to every data object that is posted on the OSN servers. The decryption privileges can be assigned at the finest granularity level, for example, to a hand-picked group of users. In addition, data decryption is performed automatically at the application layer, thus enhancing the overall experience for the end-user. Our cryptographic solution leverages hidden vector encryption (HVE) [5], which is a ciphertext-policy-based access control mechanism. Under HVE, each user generates his/her own one-time master key that is subsequently used to generate a unique decryption key for every user with whom they share a link in the underlying social graph. Moreover, during the encryption process, the user interactively selects a list of friends and/or groups that will be granted decryption privileges for that particular data object. To distribute the decryption keys, we utilize an untrusted database server where users have to register before using our system. The server stores (i) the social relationships of the registered users, (ii) their public keys, and (iii) the HVE decryption keys assigned to each user. As the database server is untrusted, the decryption keys are stored in encrypted form, i.e., they are encrypted with the public key of the underlying user. Therefore, our solution relies on the existing public key infrastructure (PKI) to ensure the integrity and authenticity of the users' public keys. To facilitate the deployment of our system over existing OSN platforms, we use steganographic techniques [6] to hide the encrypted data objects within randomly chosen cover images (stego images). The stego images are then uploaded to the OSN servers, and only authorized users (with the correct decryption keys) will be able to extract the embedded data. Unauthorized users will simply see the random cover images.
We aim to implement our system as a Chrome-based browser extension where, after installation, the user registers with the untrusted server and uploads/downloads the necessary decryption keys. The keys are also stored locally, in order to provide a user-friendly interface to share private information. Specifically, our system will offer a seamless decryption process, where all hidden data objects are displayed automatically while surfing the OSN platform, without any user interaction. References [1] S. Jahid, P. Mittal, and N. Borisov, "EASiER: encryption-based access control in social networks with efficient revocation," in Proc. ACM Symposium on Information, Computer and Communications Security (ASIACCS), pp. 411–415, 2011. [2] A. Tootoonchian, S. Saroiu, Y. Ganjali, and A. Wolman, "Lockr: better privacy for social networks," in Proceedings of the 2009 ACM Conference on Emerging Networking Experiments and Technology, CoNEXT 2009, Rome, Italy, December 1-4, 2009, pp. 169–180, 2009. [3] S. Guha, K. Tang, and P. Francis, "NOYB: privacy in online social networks," in Proc. Workshop on Online Social Networks (WOSN), pp. 49–54, 2008. [4] J. Ning, I. Singh, H. V. Madhyastha, S. V. Krishnamurthy, G. Cao, and P. Mohapatra, "Secret message sharing using online social media," in Proc. IEEE Conference on Communications and Network Security (CNS), pp. 319–327, 2014. [5] T. V. X. Phuong, G. Yang, and W. Susilo, "Efficient hidden vector encryption with constant-size ciphertext," in Proc. European Symposium on Research in Computer Security (ESORICS), pp. 472–487, 2014. [6] S. Kaur, S. Bansal, and R. K. Bansal, "Steganography and classification of image steganography techniques," in Proc. International Conference on Computing for Sustainable Global Development (INDIACom), 2014.
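As a hedged illustration of the "hide ciphertext inside a cover image" step, the sketch below embeds an arbitrary byte payload into the least-significant bits of a greyscale cover image and recovers it. The actual system relies on more robust steganographic techniques [6] and real HVE ciphertexts, so the LSB scheme, the random cover and the byte payload here are purely illustrative.

```python
import numpy as np

def embed(cover, payload):
    """Hide payload bytes (preceded by a 4-byte length) in the LSBs of a
    uint8 cover image. Illustrative only; not the scheme from [6]."""
    data = len(payload).to_bytes(4, "big") + payload
    bits = np.unpackbits(np.frombuffer(data, dtype=np.uint8))
    flat = cover.flatten()
    if bits.size > flat.size:
        raise ValueError("cover image too small for payload")
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
    return flat.reshape(cover.shape)

def extract(stego):
    flat = stego.flatten()
    length = int.from_bytes(np.packbits(flat[:32] & 1).tobytes(), "big")
    bits = flat[32:32 + 8 * length] & 1
    return np.packbits(bits).tobytes()

cover = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
ciphertext = b"encrypted data object"      # stand-in for an HVE ciphertext
stego = embed(cover.copy(), ciphertext)
assert extract(stego) == ciphertext
print("recovered", len(ciphertext), "bytes; max pixel change:",
      int(np.abs(stego.astype(int) - cover.astype(int)).max()))
```

Because only the least-significant bit of each pixel changes, the stego image is visually indistinguishable from the cover, which is what lets unauthorized users see nothing but a random picture.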
-
-
-
Resilient Output Feedback Control of Cyber-Physical Systems
By Nader Meskin
Cyber-physical system (CPS) architectures are being used in many different applications such as power systems, transportation systems, process control systems, large-scale manufacturing systems, ecological systems, and health-care systems. Many of these applications involve safety-critical systems, and hence any failures or cyber attacks can cause catastrophic damage to the physical system being controlled, resulting in drastic societal ramifications. Due to the open communication and computation platform architectures of CPS, one of the most important challenges in these systems is their vulnerability to malicious cyber attacks. Cyber attacks can severely compromise system stability, performance, and integrity. In particular, malicious attacks in feedback control systems can compromise sensor measurements as well as actuator commands to severely degrade closed-loop system performance and integrity. Cyber attacks are continuously becoming more sophisticated and intelligent, and hence it is vital to develop algorithms that can suppress their effects on cyber-physical systems. In this paper, an output feedback adaptive control architecture is presented to suppress or counteract the effect of false data injection actuator attacks in linear systems, where it is assumed that the attacker is capable of maliciously manipulating the controller commands to the actuators. In particular, the proposed controller is composed of two components, namely a nominal controller and an additive corrective signal. It is assumed that the nominal controller has already been designed and implemented to achieve a desired closed-loop nominal performance. Using the nominal controller, an additive adaptive corrective signal is designed and added to the output of the nominal controller in order to suppress the effect of the actuator attacks. Thus, in the proposed control architecture, there is no need to redesign the nominal controller; only the adaptive corrective signal is designed, using the available information from the nominal controller and the system.
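As a hedged illustration of the two-component architecture (not the paper's actual adaptive law), the discrete-time sketch below adds a gradient-type corrective estimate of a constant actuator attack to a fixed nominal state-feedback controller for a scalar plant. The plant model, gains and adaptation rule are illustrative assumptions.

```python
import numpy as np

a, b = 1.02, 0.5        # scalar plant x[k+1] = a*x + b*(u + d), slightly unstable
K = 1.2                 # nominal state-feedback gain, u_nom = -K*x  (a - b*K = 0.42)
gamma = 0.4             # adaptation gain for the corrective signal
d_true = 0.8            # constant actuator attack injected from k = 50

x = x_ref = 1.0         # plant state and attack-free reference-model state
d_hat = 0.0             # adaptive estimate of the attack
log = []

for k in range(200):
    d = d_true if k >= 50 else 0.0
    u = -K * x - d_hat                 # nominal controller + corrective signal
    e = x - x_ref                      # deviation from the nominal closed loop
    x = a * x + b * (u + d)            # attacked plant
    x_ref = (a - b * K) * x_ref        # what the nominal loop would have done
    d_hat += gamma * b * e             # gradient-type update of the estimate
    log.append((k, x, d_hat))

print("state / attack estimate near the end:", log[-1][1], log[-1][2])
```

With these illustrative values the joint error/estimation dynamics are stable, so the attack estimate converges towards the injected value and the closed loop recovers its nominal behaviour without the nominal gain K ever being redesigned, which is the point of the architecture.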
-
-
-
On Dependability, Traffic Load and Energy Consumption Tradeoff in Data Center Networks
Authors: Zina Chkirbene, Ala Gouissem, Ridha Hamila and Sebti Foufou
Mega data centers (DCs) are considered efficient and promising infrastructures for supporting numerous cloud computing services such as online office, online social networking, Web search and IT infrastructure out-sourcing. The scalability of these services is influenced by the performance and dependability characteristics of the DCs. Consequently, DC networks are constructed with a large number of network devices and links in order to achieve high performance and reliability. As a result, these requirements increase the energy consumption in DCs. In fact, in 2010, the total energy consumed by DCs was estimated to reach about 120 billion kilowatt-hours of electricity in 2012, which is about 2.8% of the total electricity bill in the USA. According to industry estimates, the USA data center market reached almost US$ 39 billion in 2009, growing from US$ 16.2 billion in 2005. One of the primary reasons behind this issue is that all the links and devices are always powered on regardless of the traffic status. Statistics show that traffic varies drastically, especially between mornings and nights, and also between working days and weekends. Thus, network utilization depends on the actual period, and generally the peak capacity of the network is reached only at rush times. This non-proportionality between traffic load and energy consumption is caused by the fact that, most of the time, only a subset of the network devices and links is enough to forward the data packets to their destinations, while the remaining idle nodes are just wasting energy. Such observations inspired us to propose a new approach that powers off unused links by deactivating the end-ports of each of them to save energy. The deactivation of ports has been proposed in many studies; however, these solutions suffer from high computational complexity, network delay and reduced network reliability. In this paper, we propose a new approach to reduce the power consumption in DCs. By exploiting the correlation in time of the network traffic, the proposed approach uses the traffic matrix of the current network state and manages the state of switch ports (on/off) at the beginning of each period, while making sure to keep the data center fully connected. During the rest of each time period, the network must be able to forward its traffic through the active ports. The decision to close or open a port depends on a predefined threshold value; the port is closed only if the sum of the traffic generated by its connected node is less than the threshold. We also investigate the minimum period of time during which a port should not change its status. This minimum period is necessary given that it takes time and energy to switch a port on and off. Also, one of the major challenges in this work is powering off the idle devices for more energy saving while guaranteeing the connectivity of each server. So, we propose a new traffic-aware algorithm that presents a tradeoff between energy saving and reliability satisfaction. For instance, in HyperFlatNet, simulation results show that the proposed approach reduces the energy consumption by 1.8×10⁴ WU (watts per unit of time) for a correlated network with 1000 servers (38% energy saving).
In addition, and thanks to the proposed traffic-aware algorithm, the new approach shows good performance even in the case of a high failure rate (up to 30%): when one third of the links fail, the connection failure rate is only 0.7%. Both theoretical analysis and simulation experiments are conducted to evaluate and verify the performance of the proposed approach compared to state-of-the-art techniques.
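A minimal sketch of the threshold rule with a connectivity guard is given below, using networkx (an assumed helper library, not the paper's implementation): links whose traffic falls below the threshold are switched off only if removing them keeps the data center graph connected. The toy topology and loads are illustrative.

```python
import networkx as nx

def sleep_ports(links, threshold):
    """links: dict {(u, v): traffic}. Returns the set of links whose end ports
    can be powered off while keeping every server connected."""
    g = nx.Graph()
    g.add_edges_from(links)
    off = set()
    # consider the least-loaded links first
    for (u, v), load in sorted(links.items(), key=lambda item: item[1]):
        if load >= threshold:
            break
        g.remove_edge(u, v)
        if nx.is_connected(g):
            off.add((u, v))
        else:                     # removing this link would disconnect a node
            g.add_edge(u, v)
    return off

# Two servers (s1, s2) dual-homed to two switches (sw1, sw2), with traffic per link
links = {("s1", "sw1"): 5, ("s2", "sw1"): 40, ("sw1", "sw2"): 90,
         ("s1", "sw2"): 2, ("s2", "sw2"): 1}
print(sleep_ports(links, threshold=10))
```

In a real deployment this decision would be re-evaluated once per period using the measured traffic matrix, with the additional minimum-hold-time constraint mentioned above to avoid switching ports on and off too frequently.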
-
-
-
Visualization of Electricity Consumption in Qatar
Authors: Salma Tarek Shalaby, Engy Khaled Soliman and Noora Fetais
The amount of raw data related to electricity consumption is increasing rapidly with the increase in construction sites, population and Qatar's preparation for the 2022 World Cup. With this increase, managers will find it difficult to study the data and keep track of the consumption. Thus, taking actions and planning for the future become hard tasks, and the decisions taken might not be beneficial because of misunderstanding of the data. In this project, a customized web application is developed to visualize the data on an interactive map. The idea behind the project is to help decision makers take actions in an efficient and easy way based on the visualized data; it thus supports Qatar's 2030 vision by saving time and electricity. Instead of reading big tables with huge, incomprehensible numbers, the application easily visualizes the average consumption on the map. It also provides different chart types to help the user compare the data and consequently take the right decision. The rapid increase of data challenges the ability to use such data in decision-making; the challenge also extends to avoiding the risk of getting lost in these big numbers. Reading such data and trying to analyze it can be wasteful in terms of time and money. Moreover, it could cut down industrial and scientific opportunities. The current solution in Qatar for electricity consumption analysis is Microsoft Excel. The stakeholders only use professional software for operational purposes, but not for analyzing the data. As a result, they see only what they asked for and waste any opportunity for deeper insight into these data. Visual analytics is a powerful tool that makes processes visible and transparent, providing a means of communicating about them rather than merely presenting results. Data visualization is an effective tool for communication regardless of the communicators' expertise. It is also viewed as an important analytical tool for effective research communication. It is not limited to the display of raw data sets, but rather covers all static and interactive visual representations of research data, which may include interactive charts, queries, etc. Combining the visualization aspect with the analytics of big data will significantly help resolve the problem of reading the electricity consumption data. One of the project's goals is to improve the readability of data insights and unlock the ultimate power of data visualization; the data presentation element is where alternative representations will be used to test the persuasive power of visualization. The project aims to make data understandable using data visualization techniques. It provides several features, such as an interactive map that indicates the average consumption. The zooming levels are divided into three levels: 1) the whole country, 2) municipalities, and 3) areas. In addition, the data is visualized using different graph types: line graphs, pie charts, bar charts and others. This helps managers and decision makers to effectively analyze the data and compare different areas and the average consumption through the years. Furthermore, it provides different utilities such as emailing the results, printing, saving and showing the table.
-
-
-
Saffara: Intelligent queuing application for improving clinical workflow
This paper examines the impact on patient experience of the creation of a bespoke patient queuing and communication application using in-house developed technologies. Sidra Medicine hospital's outpatient pharmacy was experiencing mismanaged queue lines, dissatisfied patients, and a lack of the data necessary to determine the length of time elapsing in obtaining medication. After analyzing patient surveys through sentiment analysis and the generation of word clouds, we validated that there was scope for workflow improvement in the pharmacy department. The Center for Medical Innovation, Software, and Technology (CMIST) department was commissioned to develop the software application necessary to deliver efficiency and improvement in response to the lack of a queuing and communication system. The use of an in-house development team to create an application for queuing and communication, as opposed to selecting a popular vendor software, resulted in many advantages. Some of the main advantages were that the requirements of the pharmacy were delivered through rapid customization and in multiple iterations, in response to ever-changing customer demand. By using the Scrum methodology, the team was able to deliver the application, called Saffara, for managing queues in the pharmacy and improving the patient experience while obtaining medication. The Saffara application has the unique feature of being integrated with the hospital EMR (Electronic Medical Record) system while ensuring confidentiality, efficiency and time saving. The application integrates with the hospital's EMR to obtain patient information, appointment times and prescribed medication. This integration allows for the identification of patients' progress and the calculation of patients' wait times. Patients are automatically notified when their medication is ready for collection, through system-generated SMS texts. The application also utilizes a notification display for communication with patients as part of our business continuity procedure. In addition to notifying the patient, the Saffara application also generates detailed analytical reports for each hour and for each patient, which allows us to analyze the bottlenecks in the clinical workflow. We present these technologies to stakeholders through a web dashboard and detailed web-based reports in our application. The pharmacy stakeholders, i.e., the pharmacy management team, utilize the dashboards and the quantitative data in the reports to predict staffing levels and optimize patient medication delivery. In this paper, we present the methods we use to calculate useful analytics like patient wait times across different stages in the workflow and the hourly breakdown of patients being served. We will also discuss how we reduced patient wait times by adding unique features to a queuing application, like automation of steps in the pharmacy workflow through the generation of patient identifiers and automatic ticket tracking. This paper will also highlight how we are scaling our application from the pharmacy to all clinics of the hospital. The goal of the application is to provide a consistent experience for patients in all clinics as well as a consistent way for staff to gather and analyze data for workflow improvement. Our future work is to explore how we can use machine learning to identify the parameters that play a vital role in wait times as well as patient experience. The objective of this paper is to highlight how our technology brings together patient experience and staff workflow enhancements to deliver improvement in a clinical workflow setting.
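As a minimal sketch of the two analytics mentioned (per-stage wait times and an hourly breakdown of patients served), the pandas snippet below processes a toy event log. The column names, stages and timestamps are assumptions about the data captured by the queuing workflow, not Saffara's actual schema.

```python
import pandas as pd

# Illustrative event log: one row per patient per workflow stage.
events = pd.DataFrame({
    "ticket": ["A1", "A1", "A1", "A2", "A2", "A2"],
    "stage": ["checked_in", "prescription_ready", "collected"] * 2,
    "timestamp": pd.to_datetime([
        "2018-03-19 09:05", "2018-03-19 09:35", "2018-03-19 09:40",
        "2018-03-19 10:10", "2018-03-19 10:20", "2018-03-19 10:50"]),
})

events = events.sort_values(["ticket", "timestamp"])
# Minutes spent waiting to reach each stage from the previous one.
events["wait_min"] = (events.groupby("ticket")["timestamp"].diff()
                      .dt.total_seconds() / 60)

print(events.groupby("stage")["wait_min"].mean())            # average wait per stage
served = events[events["stage"] == "collected"]
print(served.groupby(served["timestamp"].dt.hour).size())    # patients served per hour
```

Feeding such per-stage and per-hour aggregates into the dashboards is what lets the pharmacy management team spot bottleneck stages and predict staffing needs.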
-
-
-
Advances in Data-based Process Monitoring and Applications
Authors: M. Ziyan Sheriff, Hazem Nounou, M. Nazmul Karim and Mohamed Nounou
Many processes utilize statistical process monitoring (SPM) methods in order to ensure that process safety and product quality are maintained. Principal Component Analysis (PCA) is a data-based modeling and fault detection technique that is widely used in industry [1]. PCA is a dimensionality reduction technique that transforms multivariate data into a new set of variables, called principal components, which capture most of the variation in the data in a small number of variables. This work examines different improved PCA-based monitoring techniques, discusses their advantages and drawbacks, and also provides solutions to address the issues faced by these techniques. Most data-based monitoring techniques are known to rely on three fundamental assumptions: that fault-free data are not contaminated with excessive noise, are decorrelated (independent), and follow a normal distribution [2]. However, in reality, most processes may violate one or more of these assumptions. Multiscale wavelet-based data representation is a powerful data analysis tool that utilizes wavelet coefficients, which possess characteristics that are inherently able to satisfy these assumptions, as they are able to denoise data, force data to follow a normal distribution, and decorrelate them at multiple scales. Multiscale representation has been utilized to develop a multiscale principal component analysis (MSPCA) method for improved fault detection [3]. In a previous work, we also analyzed the performance of multiscale charts under violation of the main assumptions, demonstrating that multiscale methods do provide lower missed detection rates and ARL1 values when compared to conventional charts, with comparable false alarm rates [2]. The choice of wavelet to use, the choice of decomposition depth, and the Gibbs phenomenon are a few issues faced by multiscale representation, and these will be discussed in this work. Another common drawback of most conventional monitoring techniques used in industry is that they are only capable of efficiently handling linear data [4]. The kernel principal component analysis (KPCA) method is a simple improvement to the PCA model that enables nonlinear data to be handled. KPCA relies on transforming data from the time domain to a higher dimensional space where linear relationships can be drawn, making PCA applicable [5]. From a fault detection standpoint, KPCA suffers from a few issues that require discussion, i.e., the importance of the choice of kernel utilized, the kernel parameters, and the procedures required to bring the data back to the time domain, also known as the pre-image problem in the literature [6]. Therefore, this work also provides a discussion of these concerns. Recently, the literature has shown that hypothesis-testing methods, such as Generalized Likelihood Ratio (GLR) charts, can provide improved fault detection performance [7]. This is accomplished by utilizing a window of previous observations in order to compute the maximum likelihood estimates (MLEs) for the mean and variance, which are then utilized to maximize the likelihood functions in order to detect shifts in the mean and variance [8], [9]. Although utilizing a larger window length of data to compute the MLEs has been shown to reduce the missed detection rate and ARL1 values, a larger window length increases both the false alarm rate and the computational time required for the GLR statistic.
Therefore, an approach to select the window length parameter with all fault detection criteria in mind is required, and will be presented and discussed. The individual techniques described above have their own advantages and limitations. Another goal of this work is to develop new algorithms, through the efficient combination of the different SPM techniques, to improve fault detection performance. Illustrative examples using real-world applications will be presented in order to demonstrate the performance of the developed techniques as well as their applicability in practice. References: [1] I. T. Jolliffe, Principal Component Analysis, 2nd ed. New York, NY: Springer-Verlag, 2002. [2] M. Z. Sheriff and M. N. Nounou, "Improved fault detection and process safety using multiscale Shewhart charts," J. Chem. Eng. Process Technol., vol. 8, no. 2, pp. 1–16, 2017. [3] B. Bakshi, "Multiscale PCA with application to multivariate statistical process monitoring," AIChE J., vol. 44, no. 7, pp. 1596–1610, Jul. 1998. [4] M. Z. Sheriff, C. Botre, M. Mansouri, H. Nounou, M. Nounou, and M. N. Karim, "Process Monitoring Using Data-Based Fault Detection Techniques: Comparative Studies," in Fault Diagnosis and Detection, InTech, 2017. [5] J.-M. Lee, C. Yoo, S. W. Choi, P. A. Vanrolleghem, and I.-B. Lee, "Nonlinear process monitoring using kernel principal component analysis," Chem. Eng. Sci., vol. 59, no. 1, pp. 223–234, 2004. [6] G. H. Bakir, J. Weston, and B. Schölkopf, "Learning to Find Pre-Images," Adv. Neural Inf. Process. Syst. 16, pp. 449–456, 2004. [7] M. Z. Sheriff, M. Mansouri, M. N. Karim, H. Nounou, and M. Nounou, "Fault detection using multiscale PCA-based moving window GLRT," J. Process Control, vol. 54, pp. 47–64, Jun. 2017. [8] M. R. Reynolds and J. Y. Lou, "An Evaluation of a GLR Control Chart for Monitoring the Process Mean," J. Qual. Technol., vol. 42, no. 3, pp. 287–310, 2010. [9] M. R. Reynolds, J. Lou, J. Lee, and S. Wang, "The Design of GLR Control Charts for Monitoring the Process Mean and Variance," vol. 45, no. 1, pp. 34–60, 2013.
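As a minimal illustration of the baseline these improvements build on, the sketch below (Python, NumPy/scikit-learn) computes the conventional PCA monitoring statistics T² and Q (SPE) on synthetic data. It is not the MSPCA, KPCA or GLR methods discussed above, and the synthetic data, number of components and absence of formal control limits are simplifying assumptions.

```python
# Minimal PCA-based monitoring sketch (not the authors' full MSPCA/KPCA/GLR framework).
# Assumes fault-free training data X_train and new observations X_test as NumPy arrays.
import numpy as np
from sklearn.decomposition import PCA

def fit_pca_monitor(X_train, n_components=2):
    """Fit a PCA model on (assumed) fault-free data and return it with scaling parameters."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    pca = PCA(n_components=n_components).fit((X_train - mean) / std)
    return pca, mean, std

def monitoring_statistics(pca, mean, std, X):
    """Compute T^2 (variation inside the PCA subspace) and Q/SPE (residual) statistics."""
    Xs = (X - mean) / std
    scores = pca.transform(Xs)                      # projections onto principal components
    t2 = np.sum(scores**2 / pca.explained_variance_, axis=1)
    residual = Xs - pca.inverse_transform(scores)   # part not captured by the model
    q = np.sum(residual**2, axis=1)
    return t2, q

# Usage sketch with synthetic data; in practice thresholds come from chi-square or
# kernel-density approximations of the training statistics.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 5))
X_test = X_train[:50] + np.array([0, 0, 3, 0, 0])   # injected mean shift (a "fault")
pca, mu, sd = fit_pca_monitor(X_train, n_components=2)
t2, q = monitoring_statistics(pca, mu, sd, X_test)
print("max T2:", t2.max(), "max Q:", q.max())
```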
-
-
-
Haralick feature extraction from time-frequency images for automatic detection and classification of audio anomalies for road surveillance
In this paper, we propose a novel method for the detection of road accidents by analyzing audio streams for road surveillance applications. In the last decade, due to the increase in the number of people and transportation vehicles, traffic accidents have become one of the major public issues worldwide. The vast number of injuries and deaths due to road traffic accidents reveals a global road safety crisis. A total of 52,160 road traffic accidents (RTA), 1,130 injuries and 85 fatalities were registered during the year 2000 in the State of Qatar. The increase in the number of transportation vehicles around cities has raised the need for more security and safety in public environments. The most obvious reason for a person's death during an accident is the absence or prolonged response time of the first aid facility, which is due to the delay in information about the accident reaching the police, hospital or ambulance team. In the last couple of years, several surveillance systems based on image and video processing have been proposed for automatically detecting road accidents and car crashes to ensure a quick response by emergency teams. However, in some situations, such as adverse weather conditions or cluttered environments, the visual information is not sufficient, whereas analyzing the audio tracks can significantly improve the overall reliability of surveillance systems. In this paper, we propose a novel method that automatically identifies hazardous situations such as tire skidding and car crashes in the presence of background noise by analyzing the audio streams. Previous studies show that methods for the detection, estimation, and classification of nonstationary signals can be enhanced by utilizing the time-frequency (TF) characteristics of such signals. TF-based techniques have been proven to outperform classical techniques based on either the time or the frequency domain in analyzing real-life nonstationary signals. Time-frequency distributions (TFDs) provide information about signals that cannot be extracted from either the time-domain or the frequency-domain representation alone, such as the instantaneous frequency of the components of a signal. In order to exploit this extra information, the proposed approach extracts TF image features from quadratic time-frequency distributions (QTFDs) for the detection of audio anomalies. The extended modified-B distribution (EMBD) is utilized to transform a 1-dimensional audio signal into a 2-dimensional TF representation, which is interpreted as an image. Image-descriptor-based features are then extracted from the TF representation to classify the audio signals into background or abnormal activity patterns in the TF domain. The proposed features are based on Haralick's texture features extracted from the TF representation of the audio signals, treated and processed as a textured image. These features are used to characterize and hence classify audio signals into M classes. This research study demonstrates that a TF image pattern recognition approach offers significant advantages over standard signal classification methods that utilize either time-domain-only or frequency-domain-only features. The proposed method has been experimentally validated on a large open-source database of sounds, including several kinds of background noise. The events to be recognized are superimposed on different background sounds of roads and traffic jams.
The obtained results are compared with a recent study that used the same large and complex dataset of audio signals and the same experimental setup. The overall classification results confirm the superior performance of the proposed approach, with an accuracy improvement of up to 6%.
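As an illustration of the TF-image texture idea described above, the sketch below computes a few Haralick-style descriptors from a time-frequency image of an audio clip in Python. A plain spectrogram stands in for the extended modified-B distribution used in the paper, and the quantization level, distances/angles and feature choice are assumptions.

```python
# Sketch: Haralick-style texture features from a time-frequency image of an audio clip.
# A spectrogram is used here as a readily available stand-in TF representation (the
# paper uses the EMBD). Data and parameter choices are illustrative.
import numpy as np
from scipy.signal import spectrogram
from skimage.feature import graycomatrix, graycoprops

def tf_texture_features(audio, fs, levels=32):
    # 1D signal -> 2D time-frequency image
    _, _, Sxx = spectrogram(audio, fs=fs, nperseg=256)
    img = 10 * np.log10(Sxx + 1e-12)                       # log-power TF image
    # Quantize to integer gray levels for the co-occurrence matrix
    img = np.digitize(img, np.linspace(img.min(), img.max(), levels)) - 1
    glcm = graycomatrix(img.astype(np.uint8), distances=[1],
                        angles=[0, np.pi / 2], levels=levels,
                        symmetric=True, normed=True)
    # A few of Haralick's texture descriptors
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

# Usage: feature vector for a synthetic 1-second clip; in practice these vectors
# would be fed to a classifier (tire skid / crash / background noise).
fs = 16000
clip = np.random.randn(fs)
print(tf_texture_features(clip, fs).shape)
```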
-
-
-
Annotation Guidelines for Text Analytics in Social Media
Authors: Wajdi Zaghouani and Anis Charfi
A person's language use reveals much about their profile; however, research in author profiling has always been constrained by the limited availability of training data, since collecting textual data with the appropriate metadata requires a large collection and annotation effort (Maamouri et al. 2010; Diab et al. 2008; Hawwari et al. 2013). For every text, the characteristics of the author have to be known in order to successfully profile the author. Moreover, when the text is written in a dialectal variety, such as the Arabic text found online in social media, a representative dataset needs to be available for each dialectal variety (Zaghouani et al. 2012; Zaghouani et al. 2016). The existing Arabic dialects are historically related to Classical Arabic and co-exist with Modern Standard Arabic in a diglossic relation. While Standard Arabic has a clearly defined set of orthographic standards, the various Arabic dialects have no official orthographies, and a given word could be written in multiple ways in different Arabic dialects (Maamouri et al. 2012; Jeblee et al. 2014). This abstract presents the guidelines and annotation work carried out within the framework of the Arabic Author Profiling project (ARAP), a project that aims at developing author profiling resources and tools for a set of 12 regional Arabic dialects. We harvested our data from social media, which reflect a natural and spontaneous writing style in dialectal Arabic from users in different regions of the Arab world. For the Arabic language and its dialectal varieties as found in social media, to the best of our knowledge, there is no corpus available for the detection of age, gender, native language and dialectal variety; most of the existing resources are available for English or other European languages. Having a large amount of annotated data remains the key to reliable results in the task of author profiling. In order to start the annotation process, we created guidelines for the annotation of the tweets according to their dialectal variety, the native language of the user, the gender of the user and the age. Before starting the annotation process, we hired and trained a group of annotators and implemented a smooth annotation pipeline to optimize the annotation task. Finally, we followed a consistent annotation evaluation protocol to ensure a high inter-annotator agreement. The annotations were done by carefully analyzing each of the users' profiles and their tweets and, when possible, we instructed the annotators to use external resources such as LinkedIn or Facebook. We created general profile validation guidelines and task-specific guidelines to annotate the users according to their gender, age, dialect and native language. For some accounts, the annotators were not able to identify the gender, as this was based in most cases on the name of the person or their profile photo and in some cases on their biography or profile description.
In case this information is not available, we instructed the annotators to read the user posts and find linguistic indicators of the gender of the user. Like many other languages, Arabic conjugates verbs through numerous prefixes and suffixes, and the gender is sometimes clearly marked, as in the case of verbs ending in taa marbuTa, which usually indicates feminine gender. In order to annotate the users for their age, we used three categories: under 20 years, between 20 and 40 years, and 40 years and up. In our guidelines, we asked our annotators to try their best to annotate the exact age; for example, they can check the education history of the users in their LinkedIn and Facebook profiles and find when they graduated from high school in order to estimate the age of the users. As the dialect and the regions are known in advance to the annotators, we instructed them to double-check and mark the cases where a profile appears to be from a different dialect group. This is possible despite our initial filtering based on distinctive regional keywords. We noticed that in more than 90% of cases the profiles selected belong to the specified dialect group. Moreover, we asked the annotators to mark and identify Twitter profiles with a native language other than Arabic, so they are considered Arabic L2 speakers. In order to help the annotators identify those, we instructed them to look for various cues such as the writing style, the sentence structure, the word order and the spelling errors. Acknowledgements: This publication was made possible by NPRP grant #9-175-1-033 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors. References: Diab, Mona, Aous Mansouri, Martha Palmer, Olga Babko-Malaya, Wajdi Zaghouani, Ann Bies, and Mohammed Maamouri. A Pilot Arabic Propbank. LREC 2008, Marrakech, Morocco, May 28-30, 2008. Hawwari, A., W. Zaghouani, T. O'Gorman, A. Badran, and M. Diab. "Building a Lexical Semantic Resource for Arabic Morphological Patterns." Communications, Signal Processing, and their Applications (ICCSPA), pp. 1-6, 12-14 Feb. 2013. Jeblee, Serena, Houda Bouamor, Wajdi Zaghouani, and Kemal Oflazer. CMUQ@QALB-2014: An SMT-based System for Automatic Arabic Error Correction. In Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing (ANLP), Doha, Qatar, October 2014. Maamouri, Mohamed, Ann Bies, Seth Kulick, Wajdi Zaghouani, Dave Graff, and Mike Ciul. 2010. From Speech to Trees: Applying Treebank Annotation to Arabic Broadcast News. In Proceedings of LREC 2010, Valletta, Malta, May 17-23, 2010. Maamouri, Mohammed, Wajdi Zaghouani, Violetta Cavalli-Sforza, Dave Graff, and Mike Ciul. 2012. Developing ARET: An NLP-based Educational Tool Set for Arabic Reading Enhancement. In Proceedings of the 7th Workshop on Innovative Use of NLP for Building Educational Applications, NAACL-HLT 2012, Montreal, Canada. Obeid, Ossama, Wajdi Zaghouani, Behrang Mohit, Nizar Habash, Kemal Oflazer, and Nadi Tomeh. A Web-based Annotation Framework For Large-Scale Text Correction. In Proceedings of IJCNLP 2013, Nagoya, Japan. Zaghouani, Wajdi, Nizar Habash, Ossama Obeid, Behrang Mohit, Houda Bouamor, and Kemal Oflazer. 2016. Building an Arabic Machine Translation Post-Edited Corpus: Guidelines and Annotation. In Proceedings of the International Conference on Language Resources and Evaluation (LREC 2016). Zaghouani, Wajdi, Abdelati Hawwari, and Mona Diab. 2012. A Pilot PropBank Annotation for Quranic Arabic.
In Proceedings of the first workshop on Computational Linguistics for Literature, NAACL-HLT 2012, Montreal, Canada.
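A minimal sketch of the per-user annotation record implied by the guidelines above; the field names, dialect codes and age-bucket labels shown are illustrative assumptions rather than the project's exact schema.

```python
# Illustrative annotation record for one social-media user profile.
from dataclasses import dataclass
from typing import Optional

AGE_BUCKETS = ("under_20", "20_to_40", "over_40")   # the three age categories above

@dataclass
class UserProfileAnnotation:
    user_id: str
    gender: str                     # "male", "female", or "unknown" when no cue is found
    age_bucket: str                 # one of AGE_BUCKETS
    dialect: str                    # e.g. "QA" for one of the 12 regional varieties (assumed code)
    native_arabic: bool             # False marks an Arabic L2 speaker
    exact_age: Optional[int] = None # annotated when education history allows an estimate

ann = UserProfileAnnotation("user_001", "female", "20_to_40", "QA", True)
print(ann)
```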
-
-
-
Towards Open-Domain Cross-Language Question Answering
Authors: Ines Abbes, Alberto Barrón-Cedeño and Mohamed Jemni
We present MATQAM (Multilingual Answer Triggering Question Answering Machine), a multilingual answer-triggering open-domain QA system, focusing on answering questions whose answers might be found in free texts in multiple languages within Wikipedia. Obtaining relevant information from the Web has become more challenging, since online communities and social media tend to confine people to bounded trends and ways of thinking. Due to the large amount of data available, getting relevant information has become a more challenging task. Unlike standard Information Retrieval (IR), Question Answering (QA) systems aim at retrieving the relevant answer(s) to a question expressed in natural language, instead of returning a list of documents. On the one hand, information is dispersed in different languages and needs to be gathered to obtain more knowledge. On the other hand, extracting answers from multilingual documents is a complicated task, because natural languages follow diverse syntaxes and rules, especially Semitic languages such as Arabic. This project tackles open-domain QA using Wikipedia as the source of knowledge by building a multilingual (Arabic, French, English) QA system. In order to obtain a collection of Wikipedia articles as well as questions in multiple languages, we extended an existing English dataset: WikiQA (Yang et al., 2015). We used the WikiTailor toolkit (Barrón-Cedeño et al., 2015) to build a comparable corpus from Wikipedia articles and to extract the corresponding articles in Arabic, French, and English. We also used neural machine translation to generate the questions in the three languages. Our QA system consists of the following three modules. (i) Question processing transforms a natural language question into a query and determines the expected type of the answer in order to define the retrieval mechanism for the extraction function. (ii) The document retrieval module retrieves the most relevant documents from the search engines, in multiple languages, given the produced query. The purpose of this module is to identify the documents that may contain an answer to the question. It requires cross-language representations as well as machine translation technology, as the question could be asked in Arabic, French or English and the answer could be in any of these languages. (iii) The answer identification module ranks specific text fragments that are plausible answers to the question. It first ranks the candidate text fragments in the different languages and, if suitable fragments are found, they are combined into one consolidated answer. This is a variation of the cross-language QA scenario that enables answer triggering, where no concrete answer has to be provided if one does not exist. In order to build our QA system, we extend an existing framework (Rücklé and Gurevych, 2017) integrating neural networks for answer selection. References: Alberto Barrón-Cedeño, Cristina España Bonet, Josu Boldoba Trapote, and Luís Márquez Villodre. A Factory of Comparable Corpora from Wikipedia. In Proceedings of the Eighth Workshop on Building and Using Comparable Corpora, pages 3–13, Beijing, China, 2015. Association for Computational Linguistics. Andreas Rücklé and Iryna Gurevych. End-to-End Non-Factoid Question Answering with an Interactive Visualization of Neural Attention Weights.
In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics-System Demonstrations (ACL 2017), pages 19–24, Vancouver, Canada, August 2017. Association for Computational Linguistics. doi:10.18653/v1/P17-4004. URL http://aclweb.org/anthology/P17-4004. Yi Yang, Wen-tau Yih, and Christopher Meek. WikiQA: A Challenge Dataset for Open-Domain Question Answering. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 2013–2018, Lisbon, Portugal, 2015.
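The following Python sketch mirrors the three-module pipeline described above (question processing, cross-lingual retrieval, answer identification with answer triggering). All helper functions, the overlap-based scoring and the threshold are hypothetical placeholders, not the MATQAM implementation.

```python
# Toy pipeline: question processing -> cross-lingual retrieval -> answer identification
# with answer triggering (return None when no candidate scores well enough).
from typing import List, Optional, Tuple

def process_question(question: str) -> Tuple[str, str]:
    """Turn a natural-language question into a query and a coarse expected answer type."""
    query = question.rstrip("?").lower()
    expected_type = "definition" if query.startswith(("what is", "who is")) else "other"
    return query, expected_type

def retrieve_documents(query: str, languages=("ar", "fr", "en")) -> List[str]:
    """Placeholder for cross-lingual retrieval over Wikipedia in the three languages."""
    return [f"[{lang}] candidate passage for: {query}" for lang in languages]

def identify_answer(question: str, passages: List[str],
                    threshold: float = 0.5) -> Optional[str]:
    """Rank candidate passages by word overlap; trigger no answer below the threshold."""
    scored = [(len(set(question.lower().split()) & set(p.lower().split())) /
               max(len(question.split()), 1), p) for p in passages]
    best_score, best_passage = max(scored)
    return best_passage if best_score >= threshold else None

query, qtype = process_question("What is the capital of Tunisia?")
answer = identify_answer("What is the capital of Tunisia?", retrieve_documents(query))
print(qtype, answer)   # None here: no real retrieval backend, so triggering declines to answer
```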
-
-
-
Toward a Cognitive Evaluation Approach for Machine Translation Post-Editing
Authors: Wajdi Zaghouani and Irina Temnikova
Machine Translation (MT) today is used more and more by professional translators, including freelancers, companies, and official organisations such as the European Parliament. MT output, especially from publicly available MT engines such as Google Translate, is, however, well known to contain errors and to lack fluency from the human point of view. For this reason, MT-translated texts often need manual (or automatic) corrections, known as 'post-editing' (PE). Although there are fast and simple measures of post-editing cost, such as time to post-edit or edit distance, these measures do not reflect the cognitive difficulty involved in correcting the specific errors in the MT output text. As MT output texts can be of different quality, and thus contain errors of different difficulty to correct, fair compensation for post-editing should take into account the difficulty of the task, which should therefore be measured in the most reliable way. The best solution would be to build an automatic classifier which (a) assigns each MT error to a specific correction class, (b) assigns an effort value which reflects the cognitive effort a post-editor needs to make such a correction, and (c) gives a post-editing effort score to a text. On our way to building such a classifier, we investigate whether an existing cognitive effort model could provide fairer compensation for the post-editor, by testing it on a new language which strongly differs from the languages on which this methodology was previously tested. The model made use of a Statistical Machine Translation (SMT) error classification schema whose error classes were subsequently re-grouped and ranked in increasing order, so as to reflect the cognitive load post-editors experience while correcting the MT output. Error re-grouping and ranking was done on the basis of relevant psycholinguistic error-correction literature. The aim of proposing such an approach was to create a better metric for the effort a post-editor faces while correcting MT texts, instead of relying on a non-transparent MT evaluation score such as BLEU. The approach does not rely on specific software, in contrast to PE cognitive evaluation approaches based on keystroke logging or eye-tracking. Furthermore, the approach is more objective than approaches which rely on human scores of perceived post-editing effort. In its essence, it is similar to other error classification approaches. It is enriched, however, by error ranking, based on information specifying which errors require more cognitive effort to correct and which require less. In this way, the approach only requires counting the number of errors of each type in the MT output, and thus it allows the comparison of the post-editing cost of different output texts of the same MT engine, of the same text as output by different MT engines, or of different language pairs. Temnikova et al. (2010) tested this approach on two emergency instruction texts, one original (called 'Complex') and one manually simplified (called 'Simplified') according to Controlled Language (CL) text simplification rules. Both texts were translated using the web version of Google Translate into three languages: Russian, Spanish, and Bulgarian.
The MT output was manually post-edited by 3-5 human translators per language, and the number of errors per category was then manually counted by one annotator per language. Several researchers have based their work on Temnikova's cognitive evaluation approach. Among them, Koponen et al. (2012) modified the error classification by adding one additional class. Lacruz and Munoz (2014) enriched our original error ranking/classification with numerical weights from 1 to 9, which showed a good correlation with another metric they used (Pause to Word Ratio), but did not normalize the scores by text length. The weights were added to form a unique score for each text called Mental Load (ML). The work presented in this abstract makes the following contributions compared to our previous work: (1) we separate the Controlled Language (CL) evaluation, as it was in Temnikova's work, from the MT evaluation and apply the model only to MT evaluation; (2) we test the error classification and ranking method on a new (non-Indo-European) language, Modern Standard Arabic (MSA); (3) we increase the number of annotators and the amount of textual data; and (4) we test the approach on new text genres (news articles). On our way to building a classifier that would assign post-editing effort scores to new texts, we conducted a new experiment to test whether the previously introduced approach also applies to Arabic, a language different from those for which the cognitive evaluation model was initially developed. The results of the experiment confirmed once again that Machine Translation (MT) texts of different translation quality exhibit different distributions of error categories, with the texts of lower MT quality containing more errors and more of the error categories which are difficult to correct (e.g., word order errors). The results also showed some variation in the presence of certain categories of errors, which we deem typical for Arabic. The comparison of texts of better MT quality showed similar results across all four languages (Modern Standard Arabic, Russian, Spanish, and Bulgarian), which shows that the approach can be applied without modification to non-Indo-European languages in order to distinguish texts of better MT quality from those of worse MT quality. In future work, we plan to adapt the error categories to Arabic (e.g., add the category "merge tokens") in order to test whether such language-specific adaptation would lead to better results for Arabic. We plan to use a much bigger dataset and to extract most of the categories automatically. We also plan to assign weights and develop a unique post-editing cognitive difficulty score for MT output texts. We are confident that this will provide a fair estimation of the cognitive effort required for post-editors to edit such texts and will help translators receive fair compensation for their work.
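As an illustration of the kind of score such a classifier could produce, the sketch below counts errors per ranked category, weights them by rank and normalizes by text length. The category names and weights are illustrative, in the spirit of the ranking described above, not the published values.

```python
# Toy cognitive post-editing effort score: weighted error counts per 100 words.
# Categories and weights below are illustrative assumptions, not the published schema.
ERROR_WEIGHTS = {
    "punctuation": 1,
    "spelling": 2,
    "wrong_word_form": 3,
    "wrong_lexical_choice": 5,
    "missing_word": 6,
    "word_order": 9,      # word-order errors are among the hardest to correct
}

def pe_effort_score(error_counts: dict, n_words: int) -> float:
    """Weighted error count normalized by the length of the MT output."""
    raw = sum(ERROR_WEIGHTS.get(cat, 0) * n for cat, n in error_counts.items())
    return 100.0 * raw / max(n_words, 1)

# Two hypothetical MT outputs of the same 250-word source text
print(pe_effort_score({"spelling": 4, "word_order": 1}, 250))      # better MT output
print(pe_effort_score({"missing_word": 5, "word_order": 6}, 250))  # worse MT output
```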
-
-
-
A Decomposition Algorithm to Measure Redundancy in Structured Linear Systems
Authors: Vishnu Vijayaraghavan, Kiavash Kianfar, Hamid Parsaei and Yu Ding
Nowadays, inexpensive smart devices with multiple heterogeneous on-board sensors, networked through wired or wireless links and deployable in large numbers, are distributed throughout a physical process or a physical environment, providing real-time, dense spatio-temporal measurements and enabling surveillance and monitoring capabilities that could not be imagined a decade ago. Such a system-wide deployment of sensing devices is known as distributed sensing and is considered one of the top ten emerging technologies that will change the world. Oil and gas pipeline systems, electrical grid systems, transportation systems, environmental and ecological monitoring systems, security systems, and advanced manufacturing systems are just a few examples among many others. Malfunction of any of these large-scale systems typically results in enormous economic loss and sometimes even endangers critical infrastructure and human lives. In any of these systems, the system state variables, whose values trigger various actions, are estimated from the measurements gathered by the sensor system that monitors and controls the system of interest. Consequently, the reliability of these estimates is of utmost importance to the economic and safe operation of these large-scale systems. In a linear sensor system, the sensor measurements are combined linear responses of the system states that need to be estimated. In the engineering literature, a linear model is often used to establish the connection between the sensor measurements and the system's state variables through the sensor system's design matrix. In such systems, the sensor outputs y and the system states x are linked by the set of linear equations y = Ax + e, where y and e are n-by-1 vectors and x is a p-by-1 vector. A is an n-by-p design matrix (n >> p) that models the linear measurement process and is assumed to be of full column rank, i.e., r(A) = p, where r(A) denotes the rank of A. The last term e is a random noise vector, assumed to be normally distributed with mean 0. In the context of estimation reliability, the redundancy degree of a sensor system is the minimum number of sensor failures (or measurement outliers) that can happen before the identifiability of any state is compromised. This number, called the degree of redundancy of the matrix A and denoted by d(A), is formally defined as d(A) = min{d : there exists A[-d] such that r(A[-d]) < r(A)} - 1, where A[-d] is the reduced matrix obtained after deleting some d rows from the original matrix. The degree of redundancy of a linear sensor system is a measure of the robustness of the system against sensor failures and hence of the reliability of the linear sensor system. Finding the degree of redundancy for structured linear systems is proven to be NP-hard. Bound-and-decompose, mixed integer programming, and l1-minimization methods have all been studied and compared in the literature, but none of these methods is suitable for finding the degree of redundancy in large-scale sensor systems. We propose a decomposition approach which effectively disintegrates the problem into a reasonable number of smaller subproblems, exploiting the structure inherent in such linear systems using the concepts of duality and connectivity from matroid theory. We propose two different but related algorithms, both of which solve the same redundancy degree problem.
While the former algorithm applies the decomposition technique over the vector matroid (the design matrix), the latter uses its corresponding dual matroid. The subproblems are then solved using mixed integer programming to evaluate the degree of redundancy for the whole sensor system. We report substantial computational gains (up to 10 times) for both algorithms compared to even the best known existing algorithms.
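For concreteness, the sketch below checks the definition of d(A) by brute force, deleting row subsets and testing for a rank drop. It is not the proposed matroid-based decomposition; it only illustrates the quantity being computed and why exhaustive search does not scale to large sensor systems.

```python
# Brute-force check of the degree-of-redundancy definition (not the matroid-based
# decomposition): find the smallest number d of deleted rows that reduces the rank
# of A, and return d - 1.
import itertools
import numpy as np

def degree_of_redundancy(A: np.ndarray) -> int:
    n, p = A.shape
    full_rank = np.linalg.matrix_rank(A)
    for d in range(1, n - p + 2):                  # deleting n - p + 1 rows must drop the rank
        for rows in itertools.combinations(range(n), d):
            reduced = np.delete(A, rows, axis=0)
            if np.linalg.matrix_rank(reduced) < full_rank:
                return d - 1
    return n - p  # unreachable for a full-column-rank A, kept as a safe default

# Small example: 5 measurements of 2 states. Exhaustive search is only viable for
# small systems, which is exactly why decomposition methods are needed at scale.
A = np.array([[1, 0], [0, 1], [1, 1], [1, -1], [2, 1]], dtype=float)
print(degree_of_redundancy(A))
```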
-
-
-
Framework for Visualizing Browsing Patterns Captured in Computer Logs
Authors: Noora Fetais and Rachael Fernandez
Research Problem: An Intrusion Detection System (IDS) is used to prevent security breaches by monitoring and analyzing the data recorded in log files. An IDS analyst is responsible for detecting intrusions in a system by manually investigating the vast amounts of textual information captured in these logs. The activities performed by the analyst can be split into three phases, namely: i) monitoring, ii) analysis and iii) response [1]. The analyst starts by monitoring the system, application and network logs to find attacks against the system. If an abnormality is observed, the analyst moves to the analysis phase, in which he tries to diagnose the attacks by analyzing the users' activity patterns. After the reason has been diagnosed, appropriate steps are taken to resolve the attacks in the response phase. The analyst's job is time-consuming and inevitably prone to errors due to the large amount of textual information that has to be analyzed [2]. Though there have been various frameworks for visualizing information, there has not been much research aimed at visualizing the events captured in log files. Komlodi et al. (2004) proposed a popular framework with a good set of requirements for visualizing the intrusions in an IDS. However, they do not provide any details for handling the data in the logs, which is essentially the source of data for an IDS, nor do they provide any tasks for predicting an attack. It has also been identified that current information visualization (IV) systems tend to place more importance on the monitoring phase than on the other two equally important phases. Hence, a framework that can tackle this problem should be developed. Proposed Framework: We propose a framework for developing an IDS which works by monitoring the log files. The framework provides users with a set of parameters that have to be decided before developing the IDS and supports the classification of activities in the network into three types, namely: Attack, Suspicious and Not Attack. It also provides phase-specific visualization tasks, tasks required for extracting information from log files, and tasks that limit the size of the logs. We also outline the working of a Log Agent that is responsible for collecting information from different log files and then summarizing it into one master log file [3]. The proposed framework is applied to a simple file portal system that keeps track of users who access, delete or modify an existing file or add new files. The master log file captures the browsing patterns of the users of the file portal. This data is then visualized to monitor every activity in the network. Each activity is visualized as a pixel whose attributes describe whether it is an authorized activity or an illegal attempt to access the system. In the analysis phase, tasks that help to determine a potential attack and the reasoning behind the classification of an activity as Suspicious or Attack are provided. Finally, in the response phase, tasks that can resolve the attack and tasks for reporting the details of the attack for future analysis are provided. References: [1] A. Komlodi, J. Goodall, and W. Lutters, "An information visualization framework for intrusion detection," CHI '04 Extended Abstracts on..., pp. 1743–1746, 2004. [Online]. Available: http://dl.acm.org/citation.cfm?id=1062935 [2] R. Fernandez and N. Fetais, "Framework for Visualizing Browsing Patterns Captured in Computer Logs Using Data Mining Techniques," International Journal of Computing & Information Sciences, vol. 12, no. 1, pp. 83–87, 2016. [3] H. Kato, H. Hiraishi, and F. Mizoguchi, "Log summarizing agent for web access data using data mining techniques," pp. 2642–2647.
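A minimal sketch of the three-way activity classification that the framework visualizes, using illustrative rules over a summarized master-log entry; the field names and thresholds are hypothetical, not the framework's actual parameters.

```python
# Toy classifier for master-log entries into Attack / Suspicious / Not Attack;
# each resulting label would drive the colour of one pixel in the monitoring view.
from collections import Counter

def classify_activity(entry: dict) -> str:
    """entry example: {"user": "u1", "action": "delete", "authorized": False, "failed_logins": 4}"""
    if not entry.get("authorized", True) and entry.get("action") in {"delete", "modify"}:
        return "Attack"
    if entry.get("failed_logins", 0) >= 3:
        return "Suspicious"
    return "Not Attack"

master_log = [
    {"user": "u1", "action": "access", "authorized": True,  "failed_logins": 0},
    {"user": "u2", "action": "delete", "authorized": False, "failed_logins": 1},
    {"user": "u3", "action": "access", "authorized": True,  "failed_logins": 5},
]
labels = [classify_activity(e) for e in master_log]
print(labels, Counter(labels))
```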
-
-
-
Automated Service Delivery and Optimal Placement for C-RANs
Authors: Aiman Erbad, Deval Bhamare, Raj Jain and Mohammed Samaka
Traditionally, in cellular networks, users communicate with the base station that serves the particular cell under coverage. The main functions of a base station can be divided into two groups: the baseband unit (BBU) functionalities and the remote radio head (RRH) functionalities. The RRH module is responsible for digital processing, frequency filtering and power amplification. The main sub-functions of the baseband processing module are coding, modulation, the Fast Fourier Transform (FFT) and others. Data generally flows from the RRH to the BBU for further processing. Such BBU functionalities may be shifted to a cloud-based resource pool, called the Cloud Radio Access Network (C-RAN), to be shared by multiple RRHs. Advances in cloud computing, software-defined networking and virtualization technology may be leveraged by operators for the deployment of their BBU services, reducing the total cost of deployment. Recently, there has been a trend to collocate the BBU functionalities and services from multiple cellular base stations into a centralized BBU pool for statistical multiplexing gains; this technology is known as the Cloud Radio Access Network (C-RAN). C-RAN is a novel mobile network architecture that can address a number of challenges mobile operators face while trying to support growing end-user needs. The idea is to virtualize the BBU pools, which can be shared by different cellular network operators, allowing them to rent the radio access network (RAN) as a cloud service. However, the manual configuration of the BBU services over the virtualized infrastructure may be inefficient and error-prone as mobile traffic increases. Similarly, in centralized BBU pools, non-optimal placement of the Virtual Functions (VFs) might result in a high deployment cost as well as long delays for end-users, which may negate the advantages of this novel technology platform. Hence, optimized placement of these VFs is necessary to reduce the total delays as well as minimize the overall cost of operating C-RANs. Despite the great advantages provided by the C-RAN architecture, there is no explicit support for mobile operators to deploy their BBU services over the virtualized infrastructure, which may lead to ad-hoc and error-prone service deployment in the BBU pools. Given the importance of C-RANs and the ad-hoc nature of their deployment, there is a need for automated and optimal application delivery in the context of cloud-based radio access networks to fully leverage cloud computing opportunities in the Internet. In this work, we propose the development of a novel automated service deployment platform, which automates the instantiation of virtual machines in the cloud as user demands vary, to achieve end-to-end automation in service delivery for C-RANs. We also consider the problem of optimal VF placement over distributed virtual resources spread across multiple clouds, creating a centralized BBU cloud. The aim is to minimize the total response time to the base stations in the network while satisfying the cost and capacity constraints. We implement enhanced versions of two common approaches from the literature: (1) branch-and-bound (BnB) and (2) Simulated Annealing (SA). The enhancement reduces the execution complexity of the BnB heuristic so that the allocation is faster.
The proposed enhancements also improve the quality of the solution significantly. We compare the results of the standard BnB and SA schemes with the enhanced approaches to demonstrate these claims. Our aim was to develop a faster solution which can meet the latency requirements of C-RANs, while its performance (here, in terms of cost and latency) is not far from optimal. The proposed work contributes to the "Information & Computing Technology" pillar of ARC'18. It also contributes to Qatar National Vision 2030, which encourages ICT initiatives and envisages Qatar at the forefront of the latest revolutions in computing, networking, the Internet, and mobility. Mobile applications form the majority of business applications on the Internet. This research addresses the latest research issues in the proliferation of novel technologies such as 5G. The project is timely, since there is limited research in Qatar (as well as globally) on supporting application delivery in the context of multiple heterogeneous cloud-based application deployment environments.
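As an illustration of one of the two approaches, the sketch below applies a plain simulated-annealing loop to a toy VF-placement instance, minimizing total latency with a capacity penalty. The latency matrix, capacities and cooling schedule are invented for the example and do not reflect the enhanced BnB/SA schemes evaluated in this work.

```python
# Plain simulated annealing for placing BBU virtual functions (VFs) on cloud nodes so
# that total response time is minimized under node-capacity constraints (toy instance).
import math
import random

random.seed(1)
N_VFS, N_NODES = 6, 3
latency = [[random.uniform(1, 10) for _ in range(N_NODES)] for _ in range(N_VFS)]
capacity = [3, 2, 2]                     # max VFs each cloud node can host (assumed)

def cost(placement):
    total = sum(latency[vf][node] for vf, node in enumerate(placement))
    load = [placement.count(n) for n in range(N_NODES)]
    penalty = sum(max(0, load[n] - capacity[n]) for n in range(N_NODES)) * 100
    return total + penalty               # capacity violations are heavily penalized

def simulated_annealing(iters=5000, temp=10.0, cooling=0.999):
    current = [random.randrange(N_NODES) for _ in range(N_VFS)]
    best = current[:]
    for _ in range(iters):
        candidate = current[:]
        candidate[random.randrange(N_VFS)] = random.randrange(N_NODES)  # move one VF
        delta = cost(candidate) - cost(current)
        if delta < 0 or random.random() < math.exp(-delta / temp):
            current = candidate
        if cost(current) < cost(best):
            best = current[:]
        temp *= cooling
    return best, cost(best)

print(simulated_annealing())   # placement (VF -> node) and its total cost
```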
-
-
-
Does Cultural Affinity Influence Visual Attention? An Eye Tracking Study of Human Images in E-Commerce Websites
The objective of this research is to better understand the influence of cultural affinity on the design of Arab e-commerce websites, specifically the use of human images. Three versions of an e-commerce website selling cameras were built for this study: one with the image of an Arab woman holding a camera, another with the image of a Western woman holding a camera, and a third with no human image. All three websites displayed the same products (cameras) and contained the same navigational and textual elements. An eye tracking experiment involving 45 Arab participants showed that the image of the Arab woman gained participants' visual attention faster and for a longer duration than the image of the Western woman. When participants were presented with all three websites, 64.5% expressed their preference to purchase from the website with the Arab image. Although not reported in detail here, a structured questionnaire was also administered to study the influence of cultural affinity on perceived social presence and image appeal. A post-experiment interview yielded further insights into participant preferences and the selection of culture-specific content for designing Arab e-commerce websites.
-
-
-
The political influence of the Internet in the Gulf
I am a current student researching the findings from the QNRF NPRP grant, "Media Use in the Middle East" (NPRP 7-1757-5-261), a seven-nation survey by Northwestern University in Qatar. I am particularly interested in the potential of the Internet to increase feelings of political efficacy among citizens in the Arab region. The Internet has been shown to create feelings of political efficacy in many instances around the world, such as when social media accelerated Egypt's 2011 revolution (Gustin, 2011), but it can also create feelings of disempowerment (Bibri, 2015). I am interested in this topic specifically because, with the lack of freedom of expression in the Gulf region, many are turning to the Internet to share their opinions on their country's political matters. Although there are consequences for those who criticize Gulf governments, the Internet has become an increasingly significant communication tool between the state and its people. In my research, I look only at nationals in the Gulf countries of Qatar, Saudi Arabia, and the United Arab Emirates. Although the survey covers expatriate residents as well, I chose to use data on nationals only, due to the personal nature of the question. Expatriates living in one of these Gulf countries might answer the question with their own country in mind or with their country of residence in mind; how can we analyze the meaning of their responses without knowing exactly which they were thinking about? Nationals of a specific country, by contrast, will be thinking of whether they have increased political efficacy in their own country, making the analysis clearer. Also, nationals on the Internet would most likely have a bigger political influence on their country than expatriates, and officials would probably prioritize their opinions over those of expatriates, which further justifies looking only at nationals. I investigate feelings of political efficacy through the Internet by focusing on a set of two questions. The questions begin with "Do you think by using the Internet…" and then present the respondents with two different statements: "…people like you can have more political influence?" and "…public officials will care more what people like you think?" The response options were a scale of 1 to 5, where 1 means strongly disagree and 5 means strongly agree. These two statements probe different but important areas of political efficacy. They probe freedom of expression in the Gulf and whether these expressions are being heard and acted upon by the government, both in terms of citizens' perceptions of political influence and their emotional connection. My initial research demonstrates differences in levels of political efficacy across the three countries, including over time, and also within different demographics. The overall results show Saudi Arabia (58%) with the most belief in the political efficacy of the Internet, followed by Qatar (36%), then the UAE (18%). For Saudi Arabia, the majority of the sample believe they have a political influence on the Internet, whereas in Qatar and the UAE, less than half believe so. This demonstrates a significant difference between the nations, which will need further investigation. We also see some interesting changes over time. Overall, more people in 2017 believe that officials care about what is being said on social media than in 2015. In Saudi Arabia, the figure moved up 15%, whereas in the UAE there was only a 4% increase.
Since this particular question was asked differently in Qatar in 2015, comparison data was not provided. However, the overall results suggest that Gulf governments have begun to respond to the public's political input on the Internet. This will be explored further through interviews with nationals from the three countries. I have also found similarities among the three countries. Looking at the results by gender, I find that more men (for example, in Saudi Arabia, 62%) than women (54%) believe they have more political influence by using the Internet. Also, when looking at the results by cultural conservatism or cultural progressivism, more progressives than conservatives believe they have political influence (Saudi Arabia: 65% vs. 53%) and that the government cares more about what people like them think when they use the Internet (Saudi Arabia: 61% vs. 59%). My next stage of research is to investigate a number of issues about free speech and political efficacy in the Middle East that I have begun to uncover through analysis of the survey data. The main story I want to explain is the belief that the use of the Internet increases political influence in this region. I am particularly interested in investigating questions about gender inequality on social media, the extent to which the public is able to talk about important issues facing their countries on the Internet, and whether they feel they are being heard. I am also interested in learning more about the perceptions of conservative and progressive people on free speech and politics and how these perceptions have changed over the years. My analysis will be based on data from the survey as well as interviews I will conduct with citizens from the three countries, which will give context to the survey data. Works cited: Bibri, S. E. (2015). The Shaping of Ambient Intelligence and the Internet of Things. Norway: Atlantis Press. Dennis, E. E., Martin, J. D., & Wood, R. (2017). Media use in the Middle East, 2017: A seven-nation survey. Northwestern University in Qatar. Retrieved from www.mideastmedia.org/survey/2017. Gustin, S. (2011, Nov 2). Social Media Sparked, Accelerated Egypt's Revolutionary Fire. Wired. Retrieved from https://www.wired.com/2011/02/egypts-revolutionary-fire/
-
-
-
A Chereme-Based Sign Language Description System
Authors: Abdelhadi Soudi and Corinne Vinopol
Because sign languages are not written, it is challenging to describe signs without knowing a spoken language equivalent. Sign language does not represent in any direct way the form of the spoken language, either by visually representing sounds or by following the syntactic sequences of words of the spoken language. One sign may correspond to an entire Arabic phrase and vice versa. Sign language can only be described, animated or videotaped. For example, a Deaf person may find it difficult to convey a sign for a concept such as "moon" on paper if he or she does not know the Arabic word. In this paper, we describe a notation system that enables users (among other functions) to identify, using pictorial lists, the four cheremes of each hand for the STEM sign for which they want to find Standard Arabic equivalents. Using the four descriptors (hand shape, movement, location, and palm orientation) of both hands, Deaf and Hard-of-Hearing users can describe Moroccan Sign Language (MSL) signs and find corresponding Arabic STEM terms, MSL and Arabic definitions, and concept pictures. The program then searches the STEM Sign Database for the sign that most closely matches the selected cheremes, and the corresponding Standard Arabic information (definitions, parts of speech, etc.) and MSL information (graphic signs and videos) is displayed to the user. Two scenarios are possible: if the system finds an exact match of the selected cheremes, it returns the exact sign with its Standard Arabic and MSL information; otherwise, it displays the signs that most closely match the selected cheremes. In our database, signs and words have an N-to-N relationship, which means that there are signs that refer to multiple Standard Arabic words and vice versa. Therefore, we had to group the database by sign and select signs with more than one Standard Arabic equivalent. Our database is in alphabetical order by Arabic base form word/stem. This means that when an Arabic word has more than one meaning, and consequently different signs, there are separate entries for that word. In order to reverse the strategy, that is, to identify signs that can be expressed as different Arabic words and invariably have different meanings, we reordered the database according to sign graphic file name. By programming retrieval using this reverse strategy, we have created the first-ever digital MSL thesaurus. The creation of this resource required, among other things, the identification and development of codes for the MSL cheremes, code assignment to the STEM signs, and the addition of Arabic definitions and videotapes of MSL translations of definitions. A usability and feasibility evaluation of the tool was conducted by having educators of deaf children, their parents, and deaf children themselves test the software.
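A minimal sketch of the chereme-based lookup described above: the selected descriptors are matched against database entries, returning an exact match when one exists and the closest signs otherwise. The entries, chereme codes and single-hand simplification are illustrative assumptions, not the actual STEM Sign Database.

```python
# Toy chereme-based sign lookup: exact match if possible, otherwise closest by overlap.
SIGN_DB = [
    {"arabic": "قمر (moon)",  "cheremes": {"handshape": "C", "movement": "arc",
                                           "location": "head", "orientation": "palm-in"}},
    {"arabic": "شمس (sun)",   "cheremes": {"handshape": "O", "movement": "arc",
                                           "location": "head", "orientation": "palm-out"}},
    {"arabic": "خلية (cell)", "cheremes": {"handshape": "C", "movement": "circle",
                                           "location": "chest", "orientation": "palm-in"}},
]

def lookup_sign(selected: dict, top_k: int = 2):
    """Return the exact match if found, otherwise the closest signs by chereme overlap."""
    scored = sorted(SIGN_DB,
                    key=lambda s: sum(s["cheremes"][k] == v for k, v in selected.items()),
                    reverse=True)
    exact = [s for s in scored if s["cheremes"] == selected]
    return exact if exact else scored[:top_k]

query = {"handshape": "C", "movement": "arc", "location": "head", "orientation": "palm-in"}
print(lookup_sign(query))   # exact match -> the entry for "moon"
```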
-
-
-
Effectiveness of Driving Simulators as a Driving Education Tool
Authors: Semira Omer Mohammed, Wael Alhajyaseen, Rafaat Zriak and Mohammad Khorasani
The impact and validity of driving simulators as an educational tool in driving schools and in the licensing process remain questioned in the literature. Many driving schools utilize driving simulators as a tool to help students learn the required skills faster. The few existing studies show conflicting results on whether utilizing driving simulators is effective in improving the quality of the driving education process. The applications of driving simulators are not limited to driver training and education; they can also assist in identifying risky drivers for further training to improve risk perception. Driver training has two key aspects: vehicle control and safety knowledge. However, it is common for training courses to focus on vehicle control and priority rules while giving less attention to safety and risk identification skills. In this regard, driving simulators can play an important role by providing an artificial environment in which students can experience potential risks while driving. In Qatar, the training process to get licensed typically covers the basics of vehicle control and driving laws (road signs, etc.). Advanced training courses such as defensive driving are also available for those who have completed the normal training process and successfully received their license. Such advanced courses are usually limited to companies who require this training for their employees. This paper aims to investigate the effectiveness of driving simulators in driving education. A driving school in the State of Qatar utilizes advanced simulators in its training programme. This study looks at students who go through the simulator and non-simulator training tracks in the driving school. Novice students begin with a 10-hour theory course which mainly focuses on road signs and markings. Following the theory course, the students are required to complete a sign test. For those registered in the simulator track, this is followed by five simulator training sessions of 20 minutes each. The first session is for simulator adaptation and familiarization with the vehicle's controls; the student is required to drive slowly around a simple oval road, and a few cars are added at the end of the session. The second session uses a more complex road network with intersections and roundabouts. The last three sessions use a virtual replica of a section of Doha. The third session is conducted with no cars on the road, and traffic is added in the fourth session. In the fifth session, surrounding vehicles are designed to behave unexpectedly or even aggressively, with sudden lane changes, speeding and failure to give right of way. At the end of the fifth session, the student is issued a performance report. After the simulator sessions, students start the 40 hours of on-the-road training. The student is then required to pass a parking test followed by a road test. Students are permitted to take their road tests after 20 hours of on-the-road training. Each student is allowed to fail up to two road tests before being required to sign up for further courses. A random sample of student data was collected from both the simulator and non-simulator training tracks. All the students were first-time learners with no previous license who passed their road tests. The data collected include gender, age, nationality and the number of road tests undertaken before passing.
The study aims to determine whether any of the collected variables have a significant effect on the number of road tests attempted and on passing the driving test on the first attempt. The factors tested are gender, ethnicity, age and whether the student followed the simulator or the non-simulator lesson track. Furthermore, the study attempts to formulate a model that can predict the likelihood of passing the driving test on the first attempt. This pilot study is expected to clarify the effectiveness of driving simulators as an educational tool and whether their utilization is justifiable. Acknowledgment: This publication was made possible by the NPRP award [NPRP 9-360-2-150] from the Qatar National Research Fund (a member of The Qatar Foundation). The statements made herein are solely the responsibility of the author[s].
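The abstract does not specify the form of the predictive model; as one plausible reading, the sketch below fits a logistic regression over encoded versions of the collected variables. The data is synthetic and purely illustrative.

```python
# Hypothetical logistic-regression sketch for P(pass road test on first attempt).
import numpy as np
from sklearn.linear_model import LogisticRegression

# columns: gender (1 = female), simulator_track (1 = yes), age  -- illustrative encoding
X = np.array([
    [1, 1, 22], [0, 0, 35], [1, 1, 28],
    [0, 0, 19], [1, 0, 41], [0, 1, 25],
])
y = np.array([1, 0, 1, 0, 0, 1])   # 1 = passed the road test on the first attempt

model = LogisticRegression(max_iter=1000).fit(X, y)
print(model.predict_proba([[1, 1, 30]])[0, 1])   # estimated probability for a new student
```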
-
-
-
Threat-Based Security Risk Evaluation in the Cloud
Authors: Armstrong Nhlabatsi, Khaled Khan, Noora Fetais, Rachael Fernandez, Jin Hong and Dong Seong Kim
Research Problem: Cyber attacks are targeting cloud computing systems, where enterprises, governments, and individuals are outsourcing their storage and computational resources for improved scalability and dynamic management of their data. However, the different types of cyber attacks, as well as the different attack goals, make it difficult to provide the right security solution. This is because different cyber attacks are associated with different threats in cloud computing systems, and the importance of threats varies with the cloud user requirements. For example, a hospital patient record system may prioritize protection against cyber attacks tampering with patient records, while a media storage system may prioritize protection against denial-of-service attacks in order to ensure high availability. As a result, it is of paramount importance to analyze the risk associated with cloud computing systems while taking into account the importance of threats under different cloud user requirements. However, current risk evaluation approaches focus on evaluating the risk associated with the asset, rather than the risk associated with different types of threats. Such a holistic approach to risk evaluation does not show explicitly how different types of threats contribute to the overall risk of the cloud computing system. Consequently, it is difficult for security administrators to make fine-grained decisions in order to select security solutions based on the different importance of threats given the cloud user requirements. Therefore, it is necessary to analyze the risk of cloud computing systems taking into account the different importance of threats, which enables the allocation of resources to reduce particular threats, the identification of the risk associated with the different threats imposed, and the identification of the different threats associated with cloud components. Proposed Solution: The STRIDE threat modeling framework, proposed by Microsoft, can be used for threat categorization. Using STRIDE, we propose a threat-guided risk evaluation approach for cloud computing systems, which can evaluate the risk associated with each STRIDE threat category explicitly. Further, we utilize seven different types of security metrics to evaluate the risk, namely: component, component-threat, threat-category, snapshot, path-components, path-threat, and overall asset. The component, component-threat, threat-category, and snapshot metrics measure the total risk on a component, the component risk for a particular threat category, the total snapshot risk for a single threat, and the total risk of the snapshot considering all threat categories, respectively. The path-components, path-threat, and overall asset metrics measure the total risk of the components in an attack path, the risk of a single threat category in the attack path, and the overall risk to an asset considering all attack paths, respectively. These metrics make it possible to measure the contribution of each threat category to the overall risk more precisely. When a vulnerability is discovered in a component (e.g. a Virtual Machine) of the cloud deployment, the administrator first determines which types of threats could be posed should the vulnerability be successfully exploited, and what the impact of each of those threats on the asset would be.
The impact assigned to each threat type is weighted depending on the importance of the component. For example, a Virtual Machine (VM) that acts as a Web Server in a medical records management application could be assigned a higher weighting for denial-of-service threats, because if such attacks are successfully launched then the rest of the VMs reached through the Web Server will be unavailable. On the other hand, a vulnerability discovered in a VM that hosts a database of medical records would be rated highest impact for information disclosure, because if it is compromised the confidentiality of patients' medical histories will be violated. By multiplying the probability of successfully exploiting the vulnerability with the threat impact, we compute the risk of each threat type. The variation in the assignment of impact for different threat types enables our approach to compute the risks associated with the threats, thus empowering the security administrator to make fine-grained decisions on how many resources to allocate for mitigating which type of threat and which threats to prioritize. We evaluated the usefulness of our approach through its application to attack scenarios in an example cloud deployment. Our results show that it is more effective and informative to administrators than asset-based approaches to risk evaluation.
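A minimal sketch of the threat-guided computation described above: for each component and STRIDE category, risk is the exploit probability multiplied by a weighted impact, and component-threat and snapshot totals are sums of these terms. The probabilities, impacts and component names are illustrative, not the full set of seven metrics.

```python
# Toy threat-guided risk metrics: risk(component, threat) = P(exploit) * weighted impact.
STRIDE = ["Spoofing", "Tampering", "Repudiation",
          "Information Disclosure", "Denial of Service", "Elevation of Privilege"]

# component -> (exploit probability, {threat category: weighted impact on a 0-10 scale})
cloud = {
    "web_server_vm": (0.6, {"Denial of Service": 9, "Spoofing": 4}),
    "db_vm":         (0.3, {"Information Disclosure": 10, "Tampering": 7}),
}

def component_threat_risk(component):
    prob, impacts = cloud[component]
    return {threat: prob * impacts.get(threat, 0) for threat in STRIDE}

def snapshot_risk():
    per_threat = {t: 0.0 for t in STRIDE}
    for c in cloud:
        for t, r in component_threat_risk(c).items():
            per_threat[t] += r
    return per_threat, sum(per_threat.values())

per_threat, total = snapshot_risk()
print(per_threat)                       # threat-category risks: guides which threats to mitigate first
print("overall snapshot risk:", total)
```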
-
-
-
Compressive Sensing-Based Remote Monitoring Systems for IoT Applications
Authors: Hamza Djelouat, Mohamed Al Disi, Abbes Amira and Faycal Bensaali
The Internet of Things (IoT) is shifting the healthcare delivery paradigm from in-person encounters between patients and providers to an 'anytime, anywhere' delivery model. Connected health has become more prominent than ever due to the availability of wireless wearable sensors, reliable communication protocols and storage infrastructures. Wearable sensors offer various insights regarding the patient's health (electrocardiogram (ECG), electroencephalography (EEG), blood pressure, etc.) and their daily activities (hours slept, step counts, stress maps, etc.), which can be used to provide a thorough diagnosis and alert healthcare providers to medical emergencies. Remote elderly monitoring systems (REMS) are the most popular sector of connected health, due to the spread of chronic diseases amongst the older generation. Current REMS use low-power sensors to continuously collect patients' records and feed them to a local computing unit in order to perform real-time processing and analysis. Afterward, the local processing unit, which acts as a gateway, feeds the data and the analysis report to a cloud server for further analysis. Finally, healthcare providers can access the data, visualize it and provide the proper medical assistance if necessary. Nevertheless, state-of-the-art IoT-based REMS still face limitations in terms of high energy consumption due to raw data streaming. The high energy consumption decreases the sensor's lifespan immensely and hence severely degrades the overall performance of the REMS platform. Therefore, sophisticated signal acquisition and analysis methods, such as compressed sensing (CS), should be incorporated. CS is an emerging sampling/compression theory which guarantees that an N-length sparse signal can be recovered from an M-length measurement vector (M << N) using efficient algorithms such as convex relaxation approaches and greedy algorithms. This work aims to enable two different scenarios for REMS by leveraging the concept of CS in order to reduce the number of samples transmitted from the sensors while maintaining a high quality of service. The first scenario is dedicated to abnormal heartbeat detection, in which ECG data from different patients is collected, transmitted and analysed to identify any type of arrhythmia or irregular abnormality in the ECG. The second aims to develop an automatic fall detection platform that detects the occurrence of falls, their strength and their direction in order to raise an alert and provide prompt assistance and adequate medical treatment. In both applications, CS is explored to reduce the number of transmitted samples from the sensors and hence increase the sensors' lifespan. In addition, identification and detection are enabled by means of machine learning and pattern recognition algorithms. In order to quantify the performance of the system, subspace pursuit (SP) has been adopted as the recovery algorithm, whereas for data identification and classification, K-nearest neighbour (KNN), E-nearest neighbour (ENN), decision tree (BDT) and committee machine (CM) classifiers have been adopted.
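A minimal compressed-sensing sketch in Python: a K-sparse N-length signal is measured with a random M x N matrix and recovered greedily. Scikit-learn's Orthogonal Matching Pursuit stands in for the subspace pursuit algorithm adopted in this work, and the signal itself is synthetic rather than real ECG data.

```python
# Compressed sensing toy example: only M << N measurements are "transmitted",
# yet the K-sparse signal is recovered almost exactly by a greedy algorithm.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
N, M, K = 256, 64, 8                      # signal length, measurements, sparsity

x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = rng.normal(size=K)   # K-sparse synthetic signal
Phi = rng.normal(size=(M, N)) / np.sqrt(M)                 # random sensing matrix on the sensor
y = Phi @ x                                                # only M samples leave the sensor

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=K, fit_intercept=False).fit(Phi, y)
x_hat = omp.coef_
print("relative reconstruction error:",
      np.linalg.norm(x - x_hat) / np.linalg.norm(x))
```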
-
-
-
Applied Internet of Things (IoT): Car Monitoring System for Modeling of Road Safety and Traffic System in the State of Qatar
Authors: Rateb Jabbar, Khalifa Al-Khalifa, Mohamed Kharbeche, Wael Alhajyaseen, Mohsen Jafari and Shan Jiang
One of the most interesting new approaches in the transportation research field is Naturalistic Driver Behavior, which is intended to provide insight into driver behavior during everyday trips by recording details about the driver, the vehicle and the surroundings through unobtrusive data gathering equipment and without experimental control. In this paper, an Internet of Things solution that collects and analyzes data based on the Naturalistic Driver Behavior approach is proposed. The collected and analyzed data will be used to produce a comprehensive review and analysis of the existing Qatar traffic system, including traffic data infrastructure, safety planning, and engineering practices and standards. Moreover, data analytics for crash prediction, and the use of these predictions for systemic and systematic network hotspot analysis and risk-based characterization of roadways, intersections, and roundabouts, are developed. Finally, an integrated safety risk solution is proposed. The latter enables decision makers and stakeholders (road users, state agencies, and law enforcement) to identify both high-risk locations and behaviors by measuring a set of dynamic variables including event-based data, roadway conditions, and driving maneuvers. More specifically, the solution consists of a driver behavior detection system that uses mobile technologies. The system can detect and analyze several behaviors such as drowsiness and yawning. Previous works are based on detecting and extracting facial landmarks from images. However, the new suggested system is based on a hybrid approach to detect driver behavior using a deep learning technique with a multilayer perceptron classifier. In addition, this solution can also collect data about everyday trips, such as start time, end time, average speed, maximum speed, distance and minimum speed. Furthermore, every fifteen seconds it records measurements such as GPS position, distance, acceleration and rotational velocity along the roll, pitch and yaw axes. The main advantage of the solution is to reduce safety risks on the roads while optimizing safety mitigation costs to society. The proposed solution has a three-layer architecture, namely the perception, network, and application layers, as detailed below. I. The perception layer is the physical layer, composed of several Internet of Things devices, mainly smartphones equipped with cameras and sensors (magnetometer, accelerometer, gyroscope, thermometer, GPS sensor and orientation sensor), for sensing and gathering information about the driver behavior, roads and environment, as shown in Fig. 1. II. The network layer is responsible for establishing the connection with the servers. Its features are also used for transmitting and processing sensor data. In this solution, a hybrid system that collects data and stores it locally before sending it to the server is used. This technique proves its efficiency in the case of poor Internet coverage and unstable Internet connections. III. The application layer is responsible for delivering application-specific services to the end user. It consists in sending the collected data to a web server to be treated and analyzed before displaying it to the final end user.
The web service, which is part of the application layer, is the component responsible for collecting data not only from devices but also from other sources, such as the General Traffic Directorate at the Ministry of Interior, to gather crash details. This web service stores all collected data in a database server and analyses it. The stored data and analyses are then made available to end users via a website that has direct access to the web services. Fig. 1: Architecture of the IoT car monitoring solution. Keywords: Driver Monitoring System, Drowsiness Detection, Deep Learning, Real-time Deep Neural Network.
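A minimal sketch of the hybrid store-and-forward behaviour of the network layer is given below. The endpoint URL, table schema and payload format are invented for illustration; the actual system runs on the smartphone through its mobile platform APIs.

```python
import json, sqlite3, time, urllib.request

SERVER_URL = "https://example.org/api/trips"        # placeholder endpoint
db = sqlite3.connect("trip_buffer.db")
db.execute("CREATE TABLE IF NOT EXISTS samples (ts REAL, payload TEXT, synced INTEGER DEFAULT 0)")

def record_sample(reading: dict) -> None:
    """Buffer one 15-second measurement (GPS, acceleration, rotational velocity, ...) locally."""
    db.execute("INSERT INTO samples (ts, payload) VALUES (?, ?)", (time.time(), json.dumps(reading)))
    db.commit()

def sync_pending() -> None:
    """Push buffered samples to the web server; keep them locally when connectivity is poor."""
    rows = db.execute("SELECT rowid, payload FROM samples WHERE synced = 0").fetchall()
    for rowid, payload in rows:
        try:
            req = urllib.request.Request(SERVER_URL, data=payload.encode(),
                                         headers={"Content-Type": "application/json"})
            urllib.request.urlopen(req, timeout=5)
            db.execute("UPDATE samples SET synced = 1 WHERE rowid = ?", (rowid,))
            db.commit()
        except OSError:
            break   # no connectivity: retry on the next sync cycle
```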
-
-
-
Importance of Capability-Driven Requirements for Smart City Operations
Capability-oriented requirements engineering is an emerging research area in which designers face the challenge of analyzing changes in the business domain, capturing user requirements, and developing adequate IT solutions that take these changes into account and answer user needs. In this context, researching the interplay between design-time and run-time requirements with a focus on adaptability is of great importance. Approaches to adaptation in the requirements engineering area consider issues underpinning requirements awareness and requirements evolution. We are focusing on researching the influence of capability-driven requirements on architectures for adaptable systems to be utilized in smart city operations (SCOs). We investigate requirements specification, algorithms, and prototypes for SCOs with a focus on intelligent management of transportation and on validating the proposed approaches. In this framework, we conducted a systematic literature review (SLR) of requirements engineering approaches for adaptive systems (REAS). We investigated the modeling methods used, the requirements engineering activities performed, the application domains involved, and the deficiencies that need to be tackled (in REAS in general, and in SCOs in particular). We aimed at providing an updated review of the state of the art in order to support researchers in understanding trends in REAS in general, and in SCOs in particular. We also focused on the study of Requirement Traceability Recovery (RTR). RTR is the process of constructing traceability links between requirements and other artifacts. It plays an important role in many parts of the software life-cycle. RTR becomes more important and demanding in the case of systems that change frequently and continually, especially adaptive systems, where we need to manage requirement changes and analyze their impact. We formulated RTR as a mono- and a multi-objective search problem using a classic Genetic Algorithm (GA) and a Non-dominated Sorting-based Genetic Algorithm (NSGA-II), respectively. The mono-objective approach takes as input the software system and a set of requirements, and generates as output a set of traces between the artifacts of the system and the requirements given in the input. This is done based on the textual similarity between the description of the requirements and the artifacts (names of code elements, documentation, comments, etc.). The multi-objective approach takes into account three objectives, namely the recency of change, the frequency of change, and the semantic similarity between the description of the requirement and the artifact. To validate the two approaches, we used three different open source projects. The reported results confirmed the effectiveness of the two approaches in correctly generating traces between requirements and artifacts with high precision and recall. A comparison between the two approaches shows that the multi-objective approach is more effective than the mono-objective one. We also proposed an approach aiming at optimizing service composition in service-oriented architectures in terms of security goals and cost using NSGA-II, in order to help software engineers map the optimized service composition to the business process model based on security and cost. To do this, we adapted the DREAD model for security risk assessment by suggesting new categorizations for calculating the DREAD factors based on a proposed service structure and service attributes.
To validate the proposal, we implemented the YAFA-SOA Optimizer. The evaluation of this optimizer shows that the risk severity of the generated service composition is less than 0.5, which matches the validation results obtained from a security expert. We also investigated requirements modeling for an event with a large crowd using the capability-oriented paradigm. The motivation was the need to design services that meet the challenges of alignment, agility, and sustainability in relation to dynamically changing enterprise requirements, especially in large-scale events such as sports events. We introduced the challenges to the stakeholders involved in this process and advocated a capability-oriented approach for successfully addressing these challenges. We also investigated a multi-type, proactive and context-aware recommender system in the environment of smart cities. The recommender system proactively recommends gas stations, restaurants, and attractions in an Internet of Things environment. We used a neural network to do the reasoning and validated the system on 7000 random contexts. The results are promising. We also conducted a user acceptance survey (with 50 users) that showed satisfaction with the application. We also investigated capturing uncertainty in adaptive intelligent transportation systems, which need to monitor their environment at run-time and adapt their behavior in response to changes in this environment. We modelled an intelligent transportation case study using the KAOS goal model and modelled uncertainty by extending our case study with variability points, hence having different alternatives to choose from depending on the context at run-time. We handled uncertainty by modeling our alternatives using ontologies and reasoning to select the optimal alternative at run-time when uncertainty occurs. We also devised a framework, called Vehicell, that exploits 5G mobile communication infrastructures to increase the effectiveness of vehicular communications and enhance the relevant services and applications offered in urban environments. This may help in solving some mobility problems, smooth the way for innovative services to citizens and visitors, and improve the overall quality of life.
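As a small illustration of the textual-similarity ingredient of the mono-objective RTR formulation described above (not the genetic search itself), requirement and artifact descriptions can be compared with TF-IDF cosine similarity; the requirement and artifact strings below are invented examples:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

requirements = ["The system shall encrypt stored medical records",
                "The system shall alert the driver when drowsiness is detected"]
artifacts = ["RecordVault: encrypts medical records before storage",
             "DrowsinessMonitor: detects driver drowsiness from eye closure"]

# Vectorize requirement and artifact descriptions in the same TF-IDF space.
tfidf = TfidfVectorizer(stop_words="english").fit(requirements + artifacts)
similarity = cosine_similarity(tfidf.transform(requirements), tfidf.transform(artifacts))

print(similarity)   # higher scores suggest candidate requirement-to-artifact trace links
```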
-
-
-
Secure RF Energy Harvesting Scheme for Future Wireless Networks
Wireless communication is shaping the future of seamless and reliable connectivity for billions of devices. The communication sector in Qatar is mainly driven by the rising demand for higher data rates and uninterrupted connectivity of wireless devices. Wireless fidelity (Wi-Fi), cellular telephony and computer interface devices (e.g. Bluetooth) are a few of the commonly used applications for wireless distribution of information in Qatar. According to analysts, strong growth in Islamic banking, an increase in IT consolidation and the increased adoption of mobility solutions are some of the key contributors to the growth of digital infrastructure in Qatar. Modernization of legacy infrastructure is another focal point of the government of Qatar to enable its e-government initiative in rural areas and Tier II/III cities of Qatar, with long-term effects in various domains such as health, electricity, water, heat, communication and trade. Considering this exponential rise of wireless communication in Qatar, a great deal of research is being done from the perspective of secure deployment of new wireless networks. There is also a growing demand to develop more energy-efficient communication techniques to reduce the consumption of fossil fuel without compromising the quality of experience of users. This is also beneficial for solving the economic issues that cellular operators face with the ever-growing number of users. Moreover, with the upcoming FIFA World Cup in 2022, millions of dollars are being spent to enhance the capacity and security of existing and upcoming communication networks. However, with greater connectivity and ultimate functionality come several important challenges. The first challenge is the security of data, or more specifically who has access to the data. The broadcast nature of wireless channels implies that the transmitted information signals are also received by nodes other than the intended receiver, which results in the leakage of information. Encryption techniques at higher layers are used to secure transmitted information. However, the high computational complexity of these cryptographic techniques consumes a significant amount of energy. Moreover, secure secret key management and distribution via an authenticated third party is typically required for these techniques, which may not be realizable in dense wireless networks. Therefore, a considerable amount of work has recently been devoted to information-theoretic physical layer security (PLS) as a secure communication technique which exploits the characteristics of wireless channels, such as fading, noise, and interference. The varying nature of these factors causes randomness in the wireless channel, which can be exploited to achieve security. The transmission of secret messages takes place when the receiver's channel experiences less fading than the eavesdropper's channel; otherwise, transmission remains suspended. The second concern regards the limited lifetime of wireless devices, especially when massive amounts of data need to be collected and transferred across the network. This challenge can be addressed by innovative means of powering, and for small energy-limited devices this implies the use of energy harvesting (EH) techniques. In this context, the transfer of data and power over a common electromagnetic (EM) wave has gained significant research interest over the past decade.
The technique which merges wireless information transfer (WIT) with wireless power transmission (WPT) is commonly termed simultaneous wireless information and power transfer (SWIPT). However, SWIPT systems cannot be supported using conventional transmitter and receiver designs. To address this issue, two broad categories of receiver architectures have been proposed in the SWIPT literature: separated and integrated architectures. In the separated receiver architecture, the information decoder and energy harvester act as dedicated and separate units, which increases both the cost of the receiver and the complexity of the hardware. In contrast, the integrated receiver architecture jointly processes the information and energy using a unified circuitry for both, which reduces the cost and hardware complexity. Our work attempts to address the aforementioned issues by evaluating secrecy performance and proposing a practical secrecy enhancement scheme for EH wireless devices. In particular, we investigate PLS in SWIPT systems in the presence of multiple eavesdroppers. The secrecy performance of the SWIPT system is analyzed for Rician faded communication links. The security performance is analyzed for imperfect channel estimation, and for both separated and integrated receiver architectures for the SWIPT system. We derive closed-form expressions for the secrecy outage probability and the ergodic secrecy rate for the considered scenario and validate the derived analytical expressions through extensive simulations. Our results reveal that an error floor appears due to channel estimation errors at high values of signal-to-noise ratio (SNR), such that the outage probability cannot be further minimized despite an increase in the SNR of the main link. Moreover, the results show that the largest secrecy rate can be achieved when the legitimate receiver is equipped with a separated SWIPT receiver architecture and the eavesdroppers have an integrated SWIPT receiver architecture. It is also demonstrated that the power splitting factors at both the legitimate receiver and the eavesdroppers play a prominent role in determining the secrecy performance of SWIPT. We prove that a larger power splitting factor is required to ensure link security under poor channel estimation. Finally, our work discusses transmit antenna selection and baseline antenna selection schemes to improve security, and shows that transmit antenna selection outperforms baseline antenna selection. The results provided in this work can be readily used to evaluate the secrecy performance of SWIPT systems operating in the presence of multiple eavesdroppers.
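For intuition only, the secrecy outage probability discussed above can be estimated by Monte Carlo simulation. The sketch below assumes a single Rayleigh-faded eavesdropper, perfect channel estimation, and arbitrary values for the Rician K-factor, power-splitting factor and target secrecy rate, so it is far simpler than the multi-eavesdropper, imperfect-estimation analysis of this work:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
K = 3.0        # Rician K-factor of the main link (assumed)
rho = 0.6      # power-splitting factor: fraction of received power routed to the information decoder
snr_tx = 10.0  # transmit SNR (linear), illustrative
Rs = 0.5       # target secrecy rate in bits/s/Hz (assumed)

def rician_gain(n, K):
    """Squared magnitude of a unit-power Rician fading coefficient."""
    los = np.sqrt(K / (K + 1))
    nlos = np.sqrt(1 / (K + 1)) * (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    return np.abs(los + nlos) ** 2

g_main = rician_gain(n, K)                                                        # legitimate link
g_eve = np.abs((rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)) ** 2  # eavesdropper

snr_main = rho * snr_tx * g_main      # only the power-split fraction reaches the decoder
snr_eve = snr_tx * g_eve

Cs = np.maximum(0.0, np.log2(1 + snr_main) - np.log2(1 + snr_eve))  # instantaneous secrecy capacity
print("estimated secrecy outage probability:", np.mean(Cs < Rs))
```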
-
-
-
Implementing and Analyzing a Recursive Technique for Building Path Oblivious RAM
Authors: Maan Haj Rachid, Ryan Riley and Qutaibah Malluhi
It has been demonstrated that encrypting confidential data before storing it is not sufficient, because data access patterns can leak significant information about the data itself (Goldreich & Ostrovsky, 1996). Oblivious RAM (ORAM) schemes exist in order to protect the access pattern of data in a data-store. Under an ORAM algorithm, a client accesses a data store in such a way that it does not reveal which item it is interested in. This is typically accomplished by accessing multiple items on each access and periodically reshuffling some, or all, of the data on the data-store. One critical limitation of ORAM techniques is the need for large storage capacity on the client, which is typically a weak device. In this work, we utilize an ORAM technique that adapts itself to clients with very limited storage. A trivial implementation of an oblivious RAM scans the entire memory for each actual memory access; this scheme is called linear ORAM. Goldreich and Ostrovsky (Goldreich & Ostrovsky, 1996) presented two ORAM constructions with a hierarchical layered structure: the first, Square-root ORAM, provides square-root access complexity and a constant space requirement; the second, Hierarchical ORAM, requires logarithmic space and polylogarithmic access complexity. Square-root ORAM was revisited by (Zahur, et al., 2016) to improve its performance in a multi-party secure computation setting. The work of Shi et al. (Shi, Chan, Stefanov, & Li, 2011) adopted a sequence of binary trees as the underlying structure. (Stefanov, et al., 2013) utilized this concept to build a simple ORAM called Path ORAM. In Path ORAM, every block (item) of data in the input array A is mapped to a (uniformly) random leaf in a tree (typically a binary tree) on the server. This is done using a position map stored in the client memory. Each node in the tree holds exactly Z blocks, which are initially dummy blocks. Each data item is stored in a node on the path extending from the leaf to which the data item was mapped up to the root. When a specific item is requested, the position map is used to find the leaf to which the block is mapped. Then the whole path, from the block's mapped leaf up to the root, is read into a stash; we call this procedure the read-path method. The stash is a space that also resides on the client. The block is then mapped to a new (uniformly random) leaf, and the client gets the required block from the stash. We then try to evict the contents of the stash to the same path we read from, starting from the leaf towards the root; we call this procedure the write-path method. To transfer a block from the stash into a node in the tree, the tested node should have enough space, and it should be on the path to which the tested block is mapped; if both conditions are met, the block is evicted to the tested node. Naturally, the tree is encrypted: whenever the client reads a path into the stash, all read blocks are decrypted, and the client re-encrypts the blocks before writing them back to the path. The security proof of this ORAM type is given in (Stefanov, et al., 2013). We assume that the position map fits in the client's memory; since it requires O(N) space, that could be a problem. (Stefanov, et al., 2013) mentioned a general idea for a solution that uses another, smaller ORAM O1 on the server to store the position map, and stores the position map for O1 in the client's memory.
We employ a recursive generalized version of this approach: if the position map is still larger than the client's capacity, a smaller ORAM O2 is built to store the position map for O1. We call these additional trees auxiliary trees. We implemented the recursive technique for Path ORAM and studied the effect of the threshold size of the position map (and consequently, the number of auxiliary trees) on the performance of Path ORAM. We tested our implementation on 1 million items using several threshold sizes for the position map. The number of accesses is 10,000 in all tests. Our results show the expected negative correlation between time consumption and the threshold size of the position map. However, the results suggest that unless the increase in the threshold size of the position map decreases the number of trees, no significant improvement in performance will be noticed. It is also clear that the initialization process, which builds the items' tree and the auxiliary trees and fills in the initial values, comprises more than 98% of the consumed time. Accordingly, this type of ORAM suits the case of a large number of accesses, since the server can fulfil the client's requests very quickly after finishing the initialization process. References: Goldreich, O., & Ostrovsky, R. (1996). Software protection and simulation on oblivious RAMs. Journal of the ACM (JACM), vol. 43, no. 3, pp. 431–473. Shi, E., Chan, T.-H., Stefanov, E., & Li, M. (2011). Oblivious RAM with O((log N)^3) worst-case cost. International Conference on the Theory and Application of Cryptology and Information Security. Springer, pp. 197–214. Stefanov, E., Dijk, M. V., Shi, E., Fletcher, C., Ren, L., Yu, X., et al. (2013). Path ORAM: an extremely simple oblivious RAM protocol. Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security. ACM, pp. 299–310. Zahur, S., Wang, X., Raykova, M., Gascon, A., Doerner, J., Evans, D., et al. (2016). Revisiting Square-Root ORAM: Efficient random access in multi-party computation. 2016 IEEE Symposium on Security and Privacy (SP). IEEE, pp. 218–234.
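For illustration, a toy, unencrypted, single-client version of the Path ORAM access procedure described above (without dummy-block padding, re-encryption, or the recursive auxiliary trees) can be written as follows; the tree height and bucket size are arbitrary:

```python
import random

L, Z = 4, 4                                     # tree height (2**L leaves) and bucket capacity
N_LEAVES = 2 ** L
tree = {i: {} for i in range(1, 2 ** (L + 1))}  # node index -> {block_id: value}; root = 1
position = {}                                   # block_id -> leaf index (client side)
stash = {}                                      # overflow blocks held by the client

def path_nodes(leaf: int):
    """Node indices on the path from the root down to the given leaf."""
    node, nodes = 2 ** L + leaf, []
    while node >= 1:
        nodes.append(node)
        node //= 2
    return nodes[::-1]

def access(block_id, new_value=None):
    """One Path ORAM access: read the mapped path, remap the block, then greedily evict."""
    leaf = position.get(block_id, random.randrange(N_LEAVES))
    position[block_id] = random.randrange(N_LEAVES)           # remap to a fresh random leaf

    for node in path_nodes(leaf):                             # read-path: pull the path into the stash
        stash.update(tree[node]); tree[node] = {}

    if new_value is not None:
        stash[block_id] = new_value
    value = stash.get(block_id)

    for node in reversed(path_nodes(leaf)):                   # write-path: evict from leaf to root
        fits = [b for b in stash if node in path_nodes(position[b])][:Z]
        for b in fits:
            tree[node][b] = stash.pop(b)
    return value

access("patient-42", new_value="record")
print(access("patient-42"))    # -> "record"
```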
-
-
-
CONNECT: CONtextual NamE disCovery for blockchain-based services in the IoT
The Internet of Things is gaining momentum thanks to the vision of seamlessly interconnected devices it provides. However, a unified way to discover and interact with the surrounding smart environment is missing. As an outcome, we have been witnessing the development of heterogeneous ecosystems, where each service provider adopts its own protocol, thus preventing IoT devices belonging to different providers from interacting. The same is now happening with blockchain technology, which provides a robust and trusted way to accomplish tasks but does not provide interoperability, thus creating the same heterogeneous ecosystems highlighted above. In this context, the fundamental research question we address is how to find things or services in the Internet of Things. In this paper, we propose the first IoT discovery approach that answers this question by exploiting hierarchical and universal multi-layered blockchains. Our approach neither defines new standards nor forces service providers to change their own protocols. On the contrary, it leverages the existing and publicly available information obtained from each single blockchain to gain better knowledge of the surrounding environment. The proposed approach is detailed and discussed with the support of relevant use cases.
-
-
-
Multiple Input Multiple Output In-Vivo Communication for Nano Sensors at Terahertz Frequencies
Authors: Aya Fekry Abdelaziz, Ke Yang, Khalid Qaraqe, Joseph Boutros, Qammer Abbasi and Akram Alomainy
This study presents a preliminary feasibility investigation of signal propagation and antenna diversity techniques inside human skin tissue in the 0.8–1.2 terahertz (THz) frequency range, by applying a multiple-input single-output (MISO) technique. THz applications in in-vivo communication have received great attention due to unique properties such as non-ionizing radiation, strong interaction with the water content of human tissues, and molecular sensitivity [1]. This study helps to evaluate the usage and performance of MISO systems for nanoscale networks. The human skin tissue is represented by three main layers: stratum corneum, epidermis, and dermis. The path loss model and the channel characterization inside the human skin were investigated in [2]. The diversity gain (DG) for two different in-vivo channels, resulting from the signal propagation between two transmitting antennas located in the dermis layer and one receiving antenna located in the epidermis layer, is calculated to evaluate the system performance. Different diversity combining techniques are applied in this study: selection combining (SC), equal-gain combining (EGC), and maximum-ratio combining (MRC). In the simulation setting in CST Microwave Studio, the distance between the transmitting antennas is fixed, while the effect of the distance between the receiver and the transmitters is analyzed at different frequencies. Although MIMO antenna systems are used in wireless communication to enhance data throughput, this initial study predicts that they might not be useful in in-vivo nano communication. The results demonstrate that there is a high cross-correlation between the two channels. Figure 1 shows the CDF plot for the two channels with the different diversity combining techniques used in the study. [1] J. M. Jornet and I. F. Akyildiz, "Channel modeling and capacity analysis for electromagnetic wireless nanonetworks in the terahertz band," IEEE Transactions on Wireless Communications, vol. 10, no. 10, pp. 3211–3221, 2011. [2] Q. H. Abbasi, H. El Sallabi, N. Chopra, K. Yang, K. A. Qaraqe, and A. Alomainy, "Terahertz channel characterization inside the human skin for nano-scale body-centric networks," IEEE Transactions on Terahertz Science and Technology, vol. 6, no. 3, pp. 427–434, 2016.
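As a generic illustration of the three combining rules compared in this study (not the CST-simulated in-vivo channels, which are Rician-like and highly correlated), the output SNR statistics of SC, EGC and MRC over two independent Rayleigh branches can be estimated as follows:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
snr_avg = 1.0   # average per-branch SNR (linear), illustrative

# Two i.i.d. Rayleigh-faded branches; correlated branches would shrink the diversity gain.
h1 = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
h2 = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
g1, g2 = snr_avg * np.abs(h1) ** 2, snr_avg * np.abs(h2) ** 2

snr_sc = np.maximum(g1, g2)                              # selection combining
snr_mrc = g1 + g2                                        # maximum-ratio combining
snr_egc = (np.abs(h1) + np.abs(h2)) ** 2 * snr_avg / 2   # equal-gain combining (two branches)

for name, s in [("SC", snr_sc), ("EGC", snr_egc), ("MRC", snr_mrc)]:
    print(name, "P(output SNR < 0.5) =", np.mean(s < 0.5))   # one point of the output-SNR CDF
```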
-
-
-
A Cyber-Physical Testbed for Smart Water Networks Research, Education and Development in Qatar
Smart water networks integrate sensor data, computation, control, and communication technologies to enhance system performance, reliability and consumer satisfaction. These cyber-physical systems are built from, and rely upon, the tight integration of physical elements (real-time sensors and data acquisition) with cyber layers (algorithms, control, computation and communication). A cyber-physical testbed has been developed at Texas A&M University at Qatar to simulate a real smart water network for research, education, and technology development purposes. The physical components include pipes, an automated pump-storage system, programmable logic controllers, controllable valves, a disinfectant injector, sensors, and data acquisition devices. Flow, pressure, temperature, and specific water quality parameters, such as pH and conductivity, are continuously monitored by sensors, providing an operator with an up-to-date picture of the current state of the system. The pump-storage system is controlled by programmable logic controllers and is designed to enable physical evaluation and enhancement of feedback and model-predictive control algorithms. The water tank is equipped with a heating apparatus to conduct experimental studies on the effect of water temperature on the fate and transport of chlorine and disinfection byproducts in the drinking water distribution of Qatar. The physical facility is integrated with a cyber data acquisition and communications layer, and a cloud-based data storage, analytics, and visualization platform. Acquired data is stored and maintained as a non-relational database on a cloud storage service, and a MongoDB server is set up to query and write data records. The analytics backend engine performs a variety of data transforms including, but not limited to, data cleansing and time-series imputation and forecasting. The visualization frontend provides a graphical interface that allows operators to interact with the backend engine by running queries, plotting time series, running data analytics tasks, and generating reports. Together, these integrated physical and cyber layers unleash opportunities for education, research, development, evaluation, and commercialization of a variety of smart water network technologies. The testbed provides an environment that can predict leaks and pipe bursts based on real-time analytics of high-frequency pressure readings on the cloud. It also enables developing smart pump-storage control technologies that help reduce non-revenue water loss, energy costs, and carbon emissions. The research team is also investigating harnessing the profound solar power resources available in Qatar for powering treatment plants and pumps by innovating control strategies that can handle the intermittency of such renewable power sources. Two asset management models have also been developed and implemented on the testbed: (1) a performance assessment model of water distribution systems, which comprises four assessment modules for water pipelines, water accessories, water segments, and water networks; the model identifies critical factors affecting the performance of a water network and schedules pipe maintenance and replacement plans; and (2) a risk assessment model for water pipeline failure, which evaluates the risks of performance and mechanical failures and applies a hierarchical fuzzy model to determine the risk using four risk factor categories as inputs (i.e., environmental, physical, operational, and post-failure).
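A minimal sketch of how the cloud backend might query buffered readings and run a simple imputation/forecast step is shown below; the connection string, database, collection and field names are placeholders, and the rolling-mean forecast stands in for the platform's actual analytics:

```python
import pandas as pd
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")        # placeholder connection string
readings = client["swn_testbed"]["pressure_readings"]    # assumed database/collection names

# Pull one sensor's time series, oldest first (field names are assumptions).
cursor = readings.find({"sensor_id": "P-03"}, {"_id": 0, "ts": 1, "value": 1}).sort("ts", 1)
series = pd.DataFrame(list(cursor)).set_index("ts")["value"]

series = series.interpolate()                    # simple time-series imputation of missing samples
forecast = series.rolling(12).mean().iloc[-1]    # naive forecast: mean of the last 12 samples
print("next-step pressure estimate:", forecast)
```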
In addition to its research and technology purposes, this testbed has provided a valuable learning resource for both operators and students. Several undergraduate students have already been involved in the design and construction of this facility. This has created an opportunity to train, educate, and empower undergraduate students, the future engineers and industry leaders of Qatar.
-
-
-
Real-time Object Detection on Android using TensorFlow
Detection of images or moving objects has been worked on extensively and has been integrated and used in commercial, residential and industrial environments. However, most of the existing strategies and techniques have heavy limitations. One of these limitations is the low computational resources available at the user level. Other important limitations that need to be tackled are the lack of proper analysis of the measured training data, dependency on the motion of the objects, inability to differentiate one object from another, and concerns over the speed of the object under detection and the illumination. Hence, there is a need to draft, apply and recognize new detection techniques that tackle the existing limitations. In our project we have worked on a model based on Scalable Object Detection, using Deep Neural Networks to localize and track people, cars, potted plants and 16 other categories in the camera preview in real time. The large Visual Recognition ImageNet package 'inception5h' from Google is used. This is a trained model, with images of the respective categories, which is then converted to a graph file using neural networks. The graph nodes are usually huge in number and are optimized for use on Android. The use of an already available trained model is just for ease and convenience; nevertheless, any set of images can be trained and used in the Android application. An important point to note is that training the images needs high computational speeds and more than one GPU-equipped computer. A .jar file built with the help of Bazel is also added to Android Studio to support the integration of Java and TensorFlow. This jar file is the key to getting TensorFlow onto a mobile device. The jar file is built with the help of OpenCV, which is a library of programming functions mainly aimed at real-time computer vision. Once this has been integrated, any input given to the Android application in real time is predicted with the help of tiny-YOLO (you only look once, a Darknet reference network). This application supports multi-object detection, which is very useful. All the steps occur simultaneously at great speed, giving remarkable results and detecting all the categories of the trained model under good illumination. The real-time detection reference network used also works with an acceptable acceleration of moving objects but is not quite effective in low illumination. The objects are limited to 20 categories, but the scope can be broadened with a revised trained model. The 20 categories include "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train" and "tvmonitor". The application can be used handily on a mobile phone or any other smart device with minimal computational resources, i.e., no connectivity to the internet. The application challenges speed and illumination. Effective results will help in real-time detection of traffic signs and pedestrians from a moving vehicle. This goes hand in hand with similar intelligence in cameras, which can be used as an artificial eye in many areas such as surveillance, robotics, traffic, facial recognition, etc.
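For reference, the frozen-graph inference flow that the Android app performs through the TensorFlow Java/JNI bindings can be reproduced on a desktop in Python as sketched below; the graph file name, input size and tensor names ("input:0", "output:0") are assumptions that depend on the exported model:

```python
import numpy as np
import tensorflow as tf

# Load a frozen inference graph exported for the app (file name is an assumption).
with open("tensorflow_inception_graph.pb", "rb") as f:
    graph_def = tf.compat.v1.GraphDef()
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.compat.v1.import_graph_def(graph_def, name="")

with tf.compat.v1.Session(graph=graph) as sess:
    frame = np.random.rand(1, 224, 224, 3).astype(np.float32)    # stand-in for a camera frame
    scores = sess.run("output:0", feed_dict={"input:0": frame})  # tensor names depend on the model
    print("predicted class index:", int(np.argmax(scores)))
```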
-
-
-
An integrated multi-parametric system for infrastructure monitoring and early warning based on the Internet of Things
Authors: Farid Toauti, Damiano Crescini, Alessio Galli and Adel Ben Mnaouer
Our daily life strictly depends on distributed civil and industrial infrastructures, in which Qatar and other heavily industrialized countries have large investments. Recently, failures in such infrastructures have incurred enormous economic losses and development disruptions, as well as the loss of human lives. Infrastructures are strategic assets for sustainable development and require correct management. To this end, their health levels and serviceability should be continuously assessed. Geophysical and mechanical quantities that determine such serviceability include tilt angles, vibration levels, applied forces, stress, and the existence of prior structural defects. It follows that for a feasible serviceability assessment, appropriate sensing and data processing of those parameters have to be achieved. For example, bridges are monitored for structural movements and stress levels, while earthquake early warning systems detect primary seismic waves before the arrival of strong waves. In the case of riverbank conservation, the water level must be monitored together with the associated mass flow for load estimation. In addition, precipitation rate and groundwater level are paramount indicators for anticipating slope failures. Finally, strain/temperature measurements can be used to sense the health of concrete gravity or arch dams. End-users, engineers and owners can take the most appropriate decisions based on the sensed parameters. Structural Health Assessment (SHA) is not straightforward. The structural condition is generally complex in terms of architectural parameters such as damage existence, distributed masses, damping factors, stiffness matrices, and/or applied distributed forces. These factors make SHA extremely difficult and/or exceptionally expensive. With the aim of alleviating this difficulty, possible approaches to SHA are based on vibration measurements. The analysis of such measurements reveals the structure's dynamic behaviour, which in turn reflects the characteristics of, and distributed forces on, structures. Also, structural soundness is obtained by inverse analysis of the dynamic performance. However, this dynamic behaviour, which is inherently complex in both time and spatial scale, is further complicated by the fact that deterioration/damage/erosion is essentially a local phenomenon. Commonly, technicians with specific domain knowledge perform SHAs manually. Obviously, this incurs high costs and inadequate monitoring frequency. There is also a high probability of errors due to improper positioning of the instrumentation or mere mistakes during data collection. Moreover, for large structures (e.g. towers, general buildings, bridges and tunnels), data from just a few distributed sensors cannot accurately support SHA. Consequently, the use of densely distributed sensors working at a sufficiently high sampling frequency becomes a must. Physical wiring of the site under observation is impractical due to cost and architectural constraints. Thus, for Structural Health Monitoring (SHM), networks of densely distributed, wirelessly connected sensors become imperative. When a copious number of transducers is adopted, wireless communication appears attractive, and the high cost of installing wired sensors can be strongly reduced by employing wireless sensors.
In the present research, the authors implemented a WSN-based approach for widespread monitoring without imposing intolerable boundary conditions, i.e., without requiring wiring of the measuring nodes, triggering manual data collection, or imposing strong modifications to the site before the deployment of the sensory hardware (less intrusive). In view of the above discussion, the investigators explored some key issues related to the above challenges by referring to several SHM engineering paradigms. The authors designed a novel multi-parametric system dedicated to stability monitoring and control of soils, engineering works (e.g. bridges, stadiums, tunnels), underground rail tunnels, and offshore platforms, in order to continuously evaluate the danger levels of potentially unstable areas. The proposed system can be assembled 'in situ', forming an underground instrumented column where different modules are joined together on a digital bus (e.g. via RS485 or CAN bus communication). Each module contains up to ten different sensors (e.g. accelerometers, magnetometers, inclinometers, extensometers, temperature sensors, and piezometers) and an electronic board for data collection, conversion, filtering and transmission. Special flexible joints, which permit strong, continuous adaptability to bends and twists of the drilling hole, link the modules. A control unit installed above ground provides the readings at regular time intervals and is connected to other columns via wireless communication, forming a wide network; a minimal polling sketch is given below. In particular, the proposed approach allows both analysing the response of the infrastructure to vibrations on the fly, so an early warning signal can be triggered, and saving the corresponding measurements for further analysis. The authors believe that this proposal is original and unique in three aspects. First, as most earlier studies on SHM were carried out by adapting existing hardwired solutions for snapshot measurements rather than representative long-term monitoring, our proposal presents the first initiative to develop green WSN technologies applied to sustainable SHM applications. Second, it will develop tailored sensor technology and new techniques for SHM taking into account metrological and physical parameters such as resolution, cost, accuracy, size, and power consumption. Third, the project will commission a novel multi-parametric SHM system which can be customized to other areas (e.g. environmental monitoring, traffic monitoring, etc.). The research is to support innovations at the system and component levels, leading to out-of-the-box know-how. The proposed solution is based on novel/customized sensors and data processing, an environmentally powered communication platform, and communication networks and algorithms embracing the visionary nature of the IoT with out-of-the-box solutions. Specific outcomes have been experimental proof-of-concept, through testing and prototyping, of a tailored SHM sensor technology and smart techniques that uniquely provide self-calibration and self-diagnostics of faults, a viable multi-sensor instrumented column for SHM with advanced techniques, and an environmentally powered wireless platform with innovative MAC protocols (power-aware, context-aware, cognitive and polymorphic). This work employs tools and techniques of modern sensing, processing, and networking in order to generate novel SHM solutions that uniquely provide precision measurement, a green IoT-based communication approach, viability, and cost-effectiveness.
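A minimal sketch of the control unit polling its sensor modules over the shared digital bus and raising an early-warning flag is shown below; the serial port, frame format and threshold are invented for illustration and do not reflect the actual module firmware:

```python
import serial   # pyserial

bus = serial.Serial("/dev/ttyUSB0", baudrate=115200, timeout=1)   # placeholder port settings
VIBRATION_THRESHOLD_G = 0.8                                       # illustrative trigger level

def poll_module(address: int) -> dict:
    """Request one frame from a module: 'address;accel_g;tilt_deg;temp_c' (assumed format)."""
    bus.write(f"READ {address}\n".encode())
    addr, accel, tilt, temp = bus.readline().decode().strip().split(";")
    return {"address": int(addr), "accel_g": float(accel),
            "tilt_deg": float(tilt), "temp_c": float(temp)}

def scan_column(n_modules: int) -> None:
    """Poll every module on the instrumented column and flag excessive vibration."""
    for addr in range(1, n_modules + 1):
        frame = poll_module(addr)
        if frame["accel_g"] > VIBRATION_THRESHOLD_G:
            print(f"EARLY WARNING: module {addr} vibration {frame['accel_g']:.2f} g")
```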
-
-
-
Learning Spatiotemporal Latent Factors of Traffic via a Regularized Tensor Factorization: Imputing Missing Values and Forecasting
Authors: Abdelkader Baggag, Tahar Zanouda, Sofiane Abbar and Fethi Filali
Spatiotemporal data related to traffic has become commonplace due to the wide availability of cheap sensors and the rapid deployment of IoT platforms. Yet this data suffers from several challenges related to sparsity, incompleteness, and noise, which make traffic analytics difficult. In this paper, we investigate the problem of missing or noisy data in the context of real-time monitoring and forecasting of traffic congestion for road networks. The road network is represented as a directed graph in which nodes are junctions and edges are road segments. We assume that the city has deployed high-fidelity speed-reading sensors on a subset of edges. Our objective is to infer speed readings for the remaining edges in the network, as well as missing values from malfunctioning sensors. We propose a tensor representation for the series of road network snapshots, and develop a regularized factorization method to estimate the missing values while learning the latent factors of the network. The regularizer, which incorporates spatial properties of the road network, improves the quality of the results. The learned factors, along with a graph-based temporal dependency, are used in an autoregressive algorithm to predict the future state of the road network over a long horizon. Extensive numerical experiments with real traffic data from the cities of Doha (Qatar) and Aarhus (Denmark) demonstrate that the proposed approach is appropriate for imputing missing data and predicting the traffic state. Main contributions: We propose a novel temporal regularized tensor factorization framework (TRTF) for high-dimensional traffic data; TRTF provides a principled approach to account for both the spatial structure and the temporal dependencies. We introduce a novel data-driven graph-based autoregressive model, where the weights are learned from the data; hence, the regularizer can account for both positive and negative correlations. We show that incorporating temporal embeddings into CP-WOPT leads to accurate multi-step forecasting compared to state-of-the-art matrix factorization based methods. We conduct extensive experiments on real traffic congestion datasets from two different cities and show the superiority of TRTF for both tasks of missing value completion and multi-step forecasting under different experimental settings; for instance, TRTF outperforms LSM-RN by 24% and TRMF by 29%. Conclusion: We present in this paper TRTF, an algorithm for temporal regularized tensor decomposition. We show how the algorithm can be used for several traffic-related tasks such as missing value completion and forecasting. The proposed algorithm incorporates both spatial and temporal properties into tensor decomposition procedures such as CP-WOPT, yielding better learned factors. We also extend TRTF with an autoregressive procedure to allow for multi-step-ahead forecasting of future values. We compare our method to recently developed algorithms that deal with the same type of problems using regularized matrix factorization, and show that under many circumstances TRTF provides better results. This is particularly true in cases where the data suffers from high proportions of missing values, which is common in the traffic context. For instance, TRTF achieves a 20% gain in MAPE score compared to the second best algorithm (CP-WOPT) in completing missing values in the case of extreme sparsity observed in Doha.
As future work, we will first focus on adding non-negativity constraints to TRTF, although the highest fraction of negative values generated by our method throughout all the experiments did not exceed 0.7%. Our second focus will be to optimize the TRTF training phase in order to increase its scalability to handle large dense tensors, and to implement it in a parallel environment.
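To make the factorization step concrete, the sketch below shows a masked CP decomposition of a 3-way traffic tensor by gradient descent, with plain ridge regularization standing in for TRTF's graph-based spatial and temporal regularizers; it illustrates the general CP-with-missing-values idea (as in CP-WOPT), not the authors' implementation:

```python
import numpy as np

def masked_cp(X, mask, rank=3, n_iter=1000, lr=0.02, lam=0.1, seed=0):
    """Gradient-descent CP factorization of a 3-way tensor, fitting observed entries only."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = 0.1 * rng.standard_normal((I, rank))
    B = 0.1 * rng.standard_normal((J, rank))
    C = 0.1 * rng.standard_normal((K, rank))
    for _ in range(n_iter):
        X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)   # rank-R CP reconstruction
        E = mask * (X_hat - X)                        # error on observed entries only
        A -= lr * (np.einsum('ijk,jr,kr->ir', E, B, C) + lam * A)   # ridge-regularized gradients
        B -= lr * (np.einsum('ijk,ir,kr->jr', E, A, C) + lam * B)
        C -= lr * (np.einsum('ijk,ir,jr->kr', E, A, B) + lam * C)
    return A, B, C

# Synthetic example: a (segments x segments x time) speed tensor with ~40% missing entries.
rng = np.random.default_rng(1)
X_true = np.einsum('ir,jr,kr->ijk', rng.random((20, 3)), rng.random((20, 3)), rng.random((48, 3)))
mask = rng.random(X_true.shape) > 0.4
A, B, C = masked_cp(X_true * mask, mask, rank=3)
X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
print("RMSE on missing entries:", np.sqrt(np.mean((X_hat - X_true)[~mask] ** 2)))
```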
-
-
-
A Deep Learning Approach for Detection of Electricity Theft Cyber Attacks in Smart Grids
Authors: Muhammad Ismail, Mostafa Shahin, Erchin Serpedin and Khalid Qaraqe
Future smart grids rely on advanced metering infrastructure (AMI) networks for monitoring and billing purposes. However, several research works have revealed that such AMI networks are vulnerable to different kinds of cyber attacks. In this research work, we consider one type of such cyber attacks that targets electricity theft, and we propose a novel detection mechanism based on a deep machine learning approach. While existing research papers focus on shallow machine learning architectures to detect these cyber attacks, we propose a deep feedforward neural network (D-FF-NN) detector that can thwart such cyber attacks efficiently. To optimize the D-FF-NN hyper-parameters, we apply a sequential grid search technique that significantly improves the detector's performance while reducing the associated complexity of the learning process. We carry out extensive studies to test the proposed detector based on publicly available real load profile data of 5000 customers. The detector's performance is investigated against a mixture of different attacks including partial reduction attacks, selective by-pass attacks, and price-based load control attacks. Our study reveals that the proposed D-FF-NN detector presents a superior performance compared with state-of-the-art detectors that are based on shallow machine learning architectures.
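As a simplified stand-in for the proposed detector, the sketch below trains a small feed-forward network with a plain (non-sequential) grid search over two hyper-parameters on synthetic load profiles that mimic a partial-reduction attack; the data, architecture and parameter grid are illustrative only:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
X_honest = rng.gamma(2.0, 1.0, size=(2000, 48))                     # 48 half-hourly readings per day
X_theft = X_honest[:1000] * rng.uniform(0.1, 0.6, size=(1000, 1))   # crude partial-reduction attack
X = np.vstack([X_honest[1000:], X_theft])
y = np.array([0] * 1000 + [1] * 1000)                               # 0 = honest, 1 = theft
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Exhaustive grid search shown here for brevity; the paper tunes hyper-parameters sequentially.
grid = {"hidden_layer_sizes": [(64,), (64, 32), (128, 64, 32)], "alpha": [1e-4, 1e-3]}
search = GridSearchCV(MLPClassifier(max_iter=500, random_state=0), grid, cv=3)
search.fit(X_tr, y_tr)
print("best params:", search.best_params_)
print("cross-validated accuracy:", search.best_score_)
print("held-out accuracy:", search.score(X_te, y_te))
```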
-
-
-
A simple and secure framework for protecting sensitive data stored on the cloud
Authors: Elias Yaacoub, Ali Sakr and Hassan Noura
In the past decade, cloud computing emerged as a new computing concept with a distributed nature, using virtual networks and systems. Many businesses rely on this technology to keep their systems running, but concerns are rising about security breaches in cloud computing. This work presents a secure approach for storing data on the cloud. The proposed methodology is as follows:
1) The client who wants to store data on the cloud subscribes with n cloud providers (CPs).
2) A file F to be stored on the cloud is subdivided into n parts, or subfiles: F1, F2, ..., Fn.
3) Each part is encrypted with an encryption key Kf. The encrypted parts are denoted by F1*, F2*, ..., Fn*.
4) A random permutation vector P(F) is generated.
5) The encrypted parts are stored on the n clouds according to P(F); in other words, due to P(F), F1* could be stored on CP3, for example, F2* on CPn, etc.
6) In order to be able to retrieve his files, the client needs to maintain some information related to the distribution of the various parts and to the encryption of the file. Thus, he maintains two tables.
7) The first table contains a hash of the file name H(F_name), and the key Kf.
8) The second table contains a hash of the file name H(F_name), a hash of the file content itself (unencrypted) H(F), a hash of the encrypted file content H(F*), and the permutation vector P(F), encrypted with Kf.
9) The two tables are stored on different servers protected by advanced security measures, and preferably located at different locations.
10) In order to obtain the file, the client enters the file name. The system then computes the hash value of the name, finds the corresponding entry in Table 1, and obtains the key. Then, the corresponding entry in Table 2 is found. The key obtained from Table 1 is used to decrypt the permutation vector P(F). The encrypted parts are then downloaded from the different cloud providers, assembled in the correct order and decrypted. The hash values of the encrypted and unencrypted versions are then computed and compared to their corresponding values stored in Table 2 in order to check the integrity of the file downloaded from the cloud.
This approach allows the client to use the same storage space on the cloud: instead of using a single cloud provider to store a file of size S bits, the client uses n cloud providers, storing S/n bits with each cloud provider. Thus, the storage costs of the two methods are comparable. If the client wishes to introduce redundancy into the file, such that the whole file can be recovered from j parts instead of n parts, with j <= n, then redundancy can be added to the original file as appropriate. In this case, the storage costs will increase accordingly, but this is an added enhancement that can be used with or without the proposed approach. On the other hand, the overhead due to the proposed approach consists of maintaining two tables containing the information relevant to the file. The storage required to maintain the entry corresponding to each file in these two tables is small compared to a typical file size: in fact, we only need to store a few hash values (of fixed size), along with the encryption key. This seems a reasonable price to pay for a client that has sensitive data that cannot be posted unencrypted on the cloud, or for which even posting it encrypted with a single provider is risky in case a security breach occurs at the provider's premises.
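A minimal sketch of steps 2)–5) and the two lookup tables is given below; uploading to the n cloud providers is abstracted away, the tables are plain dictionaries, and Fernet stands in for whichever cipher is actually used with Kf:

```python
import hashlib, random
from cryptography.fernet import Fernet

def store(file_name: str, data: bytes, n: int):
    """Split, encrypt and permute a file; return the two client-side tables and the uploads."""
    kf = Fernet.generate_key()                               # Kf
    f = Fernet(kf)
    part_size = -(-len(data) // n)                           # ceiling division
    parts = [f.encrypt(data[i * part_size:(i + 1) * part_size]) for i in range(n)]  # F1*, ..., Fn*
    perm = random.sample(range(n), n)                        # P(F): part i goes to provider perm[i]

    h_name = hashlib.sha256(file_name.encode()).hexdigest()  # H(F_name)
    table1 = {h_name: kf}                                    # Table 1: H(F_name) -> Kf
    table2 = {h_name: {                                      # Table 2: hashes + encrypted P(F)
        "H(F)": hashlib.sha256(data).hexdigest(),
        "H(F*)": hashlib.sha256(b"".join(parts)).hexdigest(),
        "P(F)": f.encrypt(bytes(perm)),                      # assumes n <= 255 for this toy encoding
    }}
    uploads = {perm[i]: parts[i] for i in range(n)}          # provider index -> encrypted part
    return table1, table2, uploads

t1, t2, uploads = store("records.db", b"sensitive medical data" * 100, n=4)
print(sorted(uploads))                                       # parts spread over providers 0..3
```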
-
-
-
Substring search over encrypted data
Our data, be it personal or professional, is increasingly outsourced. This results from the development of cloud computing in the past ten years, a paradigm that shifts computing to a utility. Even without realizing it, cloud computing has entered our lives inexorably: every owner of a smartphone, every user of a social network is using cloud computing, as most IT companies, and tech giants in particular, use infrastructure as a service to offer services in the model of software as a service. These services (dropbox, google, facebook, twitter…) are simple to use, flexible… and free! Users just send their data and they get all the services without paying. Actually, these companies make most of their revenues by profiling the users thanks to the data that the users willingly provide. The data is the indirect payment for benefiting from these services. This raises privacy concerns at the personal level, as well as confidentiality issues for sensitive documents in a professional environment. The classical way of dealing with confidentiality is to conceal the data through encryption. However, cloud providers need access to data in order to provide useful services, not only to profile users. Take a cloud email service as an example, where the emails are stored and archived in the cloud and only downloaded to the user's phone or computer when the user wants to read them. If the emails are encrypted in the cloud, the cloud cannot access them and confidentiality is enforced. However, the cloud can then also not provide any useful service to the user, such as a search functionality over emails. To meet these conflicting requirements (hiding the data and accessing the data), a solution is to develop mechanisms that allow computation on encrypted data. While generic protocols for computation on encrypted data have been researched and developed, such as Gentry's breakthrough fully homomorphic encryption, their performance remains unsatisfactory. On the contrary, tailoring solutions to specific needs results in more practical and efficient solutions. In the case of searching over encrypted data, searchable encryption algorithms have been developed for over a decade and now achieve satisfactory performance (linear in the size of the dictionary). Most of the work in this field focuses on single-keyword search in the symmetric setting. To overcome this limitation, we first proposed a scheme based on letter orthogonalization that allows testing of string membership by performing efficient inner products (AsiaCCS 2013). Going further, we now propose a general solution to the problem of efficient substring search over encrypted data. The solution enhances existing "keyword" searchable encryption schemes by allowing searching for any part of encrypted keywords without requiring one to store all possible combinations of substrings from a given dictionary. The proposed technique is based on the previous idea of letter orthogonalization. We first propose SED-1, the base protocol for substring search. We then identify some attacks on SED-1 that demonstrate the complexity of the substring search problem under different threat scenarios. This leads us to propose our second and main protocol, SED-2. The protocol is also efficient in that the search complexity is linear in the size of the keyword dictionary. We run several experiments on a sizeable real-world dataset to evaluate the performance of our protocol.
This final work has been accepted for publication in the IOS Press Journal of Computer Security: https://content.iospress.com/articles/journal-of-computer-security/jcs14652.
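As a toy, unencrypted illustration of the letter-orthogonalization idea underlying these protocols (membership testing via inner products), each letter can be mapped to an orthonormal vector; the real SED protocols additionally encode letter positions and operate over encrypted data:

```python
import string
import numpy as np

ALPHABET = string.ascii_lowercase
E = np.eye(len(ALPHABET))                               # one orthonormal vector per letter

def encode(word: str) -> np.ndarray:
    """Keyword representation: the sum of the (mutually orthogonal) vectors of its letters."""
    return sum(E[ALPHABET.index(c)] for c in word)

kw = encode("medical")
for letter in "mz":
    score = float(np.dot(E[ALPHABET.index(letter)], kw))  # inner-product membership test
    print(letter, "occurs in keyword:", score > 0)
```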
-
-
-
Almost BP-XOR Coding Technique for Tolerating Three Disk Failures in RAID-7 Architectures
Authors: Naram Mhaisen, Mayur Punkar, Yongge Wang, Yvo Desmedt and Qutaibah Malluhi
Redundant Array of Independent Disks (RAID) storage architectures protect digital infrastructure against potential disk failures. For example, RAID-5 and RAID-6 architectures provide protection against one and two disk failures, respectively. Recently, data generation has increased significantly due to the emergence of new technologies, and the size of storage systems is growing rapidly to accommodate such large data volumes, which increases the probability of disk failures. This necessitates a new RAID architecture that can tolerate up to three disk failures. RAID architectures implement coding techniques. The code specifies how data is stored among multiple disks and how lost data can be recovered from surviving disks. This abstract introduces a novel coding scheme for new RAID-7 architectures that can tolerate up to three disk failures. The code is an improved version of the existing BP-XOR code and is called "Almost BP-XOR". There are multiple codes that can be used for RAID-7 architectures. However, [5,2] BP-XOR codes have significantly lower encoding and decoding complexities than most common codes [1]. Despite this fact, this code does not achieve the fastest data decoding and reconstruction speeds due to its relatively low efficiency of 0.4. Furthermore, the existence of MDS [6,3] b x 6 BP-XOR codes, b > 2 (which would achieve an efficiency of 0.5), is still an open research question. This work proposes [6,3] 2 x 6 Almost BP-XOR codes. These codes largely utilize the simple and fast BP-XOR decoder while achieving an efficiency of 0.5, leading to the fastest recovery from disk failures among state-of-the-art codes. An algorithm to generate a [6,3] 2 x 6 Almost BP-XOR code has been developed, and an example code is provided in Table 1. The [6,3] 2 x 6 Almost BP-XOR codes are constructed in such a way that any three-column-erasure pattern results in one of the following two main scenarios. First: at least one of the surviving degree-three encoding symbols contains two known information symbols. This scenario occurs in 70% of three-column erasure cases (i.e. 14 out of the 20 possible cases). The recovery process in this scenario is identical to that of the BP-XOR codes; knowing any two information symbols in a degree-three encoding symbol is sufficient to recover the third information symbol through a simple XOR operation. Second: none of the surviving degree-three encoding symbols contains two known information symbols. This scenario occurs in the remaining 30% of three-column erasure cases (i.e., 6 out of the possible 20). The BP-XOR decoder fails in this scenario. However, due to the construction of the codes, at least one surviving degree-three encoding symbol contains a known information symbol; thus, knowing one of the remaining two information symbols in such a degree-three encoding symbol will restart the BP-XOR decoder. Table 2 shows these erasure patterns along with an expression for one of the missing information symbols. These expressions can be stored in buffers and used whenever the corresponding erasure pattern occurs. The solutions in Table 2 are derived from the inverse of a 6 x 6 submatrix that results from the generator matrix G by deleting the columns of G corresponding to erased code columns. The read complexity of Almost BP-XOR codes is 1.
On the other hand, the decoding of Almost BP-XOR codes requires just 6 XOR operations when BP-XOR decoding succeeds for a given three-column-erasure pattern. However, when the BP-XOR decoder fails, it requires up to 15 XOR operations in total. The normalized repair complexity is 15/6 = 2.5. Experimentally, Fig. 1 shows that the proposed Almost BP-XOR codes require the least amount of time to decode and reconstruct erased columns. Thus, it is concluded that the [6,3] 2 x 6 Almost BP-XOR codes are best suited for RAID-7 systems that require a storage efficiency of 0.5. References [1] Y. Wang, "Array BP-XOR codes for reliable cloud storage systems," in Proc. of the 2013 IEEE International Symposium on Information Theory (ISIT), pp. 326–330, Istanbul, Turkey, July 2013. Note: figures, tables, and more details are provided in the complete attached file titled Abstract-ARC18.pdf (respecting the same word count restriction).
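The XOR recovery step that both scenarios rely on is easy to illustrate: within a degree-three encoding symbol s = a ^ b ^ c, the stored symbol plus any two information symbols yields the third. The byte strings below are arbitrary stand-ins for code symbols:

```python
import os

a, b, c = (os.urandom(8) for _ in range(3))        # three 8-byte information symbols
xor = lambda x, y: bytes(p ^ q for p, q in zip(x, y))
s = xor(xor(a, b), c)                              # stored degree-three encoding symbol

# Suppose the disk holding c failed: recover it from the surviving symbol s and the known a, b.
c_recovered = xor(xor(s, a), b)
assert c_recovered == c
print("recovered symbol:", c_recovered.hex())
```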
-
-
-
Leveraging Online Social Media Data for Persona Profiling
Authors: Bernard J. Jansen, Soon-gyo Jung, Joni Salminen, Jisun An and Haewoon Kwak

The availability of large quantities of online data affords the isolation of key user segments based on demographics and behaviors for many online systems. However, there is an open question of how organizations can best leverage this user information in communication and decision-making. The automatic generation of personas to represent customer segments is an interactive design technique with considerable potential for product development, policy decisions, and content creation. A persona is an imaginary but characteristic person that represents a customer, audience, or user segment; the represented segment shares common characteristics in terms of behavioral attributes or demographics. A persona is generally developed as a detailed profile narrative, typically one or two pages, about a representative but imaginary individual that embodies the collection of users with similar behaviors or demographics. To make this fictitious individual appear as a real person to system developers and other decision-makers, the persona profile usually comprises a variety of demographic and behavioral details, such as socioeconomic status, gender, hobbies, family members, friends, and possessions, among other data and information. Along with this data, persona profiles typically address the goals, needs, wants, frustrations, and other attitudinal aspects of the fictitious individual that are relevant to the product being designed and developed. Personas have typically been fairly static once created using manual, qualitative methods. In this research, we demonstrate a data-driven approach for creating and validating personas in real time, based on automated analysis of actual user data. Using a variety of data collection sites and research partners from various verticals (digital content, non-profits, retail, service, etc.), we are specifically interested in understanding the users of these organizations by identifying (1) whom the organizations are reaching (i.e., user segments) and (2) what content is associated with each user segment. Focusing on one aspect of user behavior, we collect tens of millions of instances of user interaction with online content, specifically examining the topics of content interaction. We then decompose the interaction patterns, discover related impactful demographics, and add personal properties; this approach creates personas based on the behavioral and demographic aspects that represent the core user segments for each organization. We conduct analysis to remove outliers and use non-negative matrix factorization to identify first the meaningful behavioral patterns and then the impactful demographic groupings. We then demonstrate how these findings can be leveraged to generate real-time personas based on actual user data to facilitate organizational communication and decision-making. Demonstrating that these insights can be used to develop personas in near real time, the results provide insights into user segmentation, competitive marketing, topical interests, and preferred system features. Overall, the research implies that personas representing the core user groups of online products can be generated in near real time.
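The decomposition step described above can be pictured with a short, hypothetical sketch: non-negative matrix factorization applied to a (user group × content topic) interaction matrix, where each latent component suggests one candidate persona. The data, matrix dimensions, and number of components below are invented for illustration and do not reproduce the authors' actual pipeline.

```python
# Sketch: factor an interaction-count matrix into latent behavioral patterns.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
# rows: demographic user groups, columns: content topics, values: interaction counts
interactions = rng.poisson(lam=3.0, size=(40, 25)).astype(float)

nmf = NMF(n_components=5, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(interactions)   # group -> latent behavioral-pattern weights
H = nmf.components_                   # latent pattern -> topic weights

# Each latent pattern can seed one persona: its dominant topics describe the
# persona's interests, and the groups loading highest on it supply demographics.
for p in range(H.shape[0]):
    top_topics = np.argsort(H[p])[::-1][:3]
    top_groups = np.argsort(W[:, p])[::-1][:3]
    print(f"pattern {p}: topics {top_topics.tolist()}, groups {top_groups.tolist()}")
```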
-
-
-
A Fast and Secure Approach for the Transmission of Monitoring Data over MultiRATs
Authors: Elias Yaacoub, Rida Diba and Hassan Noura

In an mHealth remote patient monitoring scenario, control units/data aggregators usually receive data from the body area network (BAN) sensors and then send it to the network or “cloud”. The control unit has to transmit the measurement data to the home access point (AP), e.g. using WiFi, or directly to a cellular base station (BS), e.g. using the long-term evolution (LTE) technology, or both, e.g. using multi-homing to transmit over multiple radio access technologies (multi-RATs). Fast encryption or physical-layer security techniques are needed to secure the data. During normal conditions, monitoring data can be transmitted using best-effort transmission. However, when real-time processing detects an emergency situation, the current monitoring data should be transmitted in real time to the appropriate medical personnel in emergency response teams. The proposed approach benefits from the presence of multi-RATs to exchange the secrecy information more efficiently while optimizing the transmission time. Assuming there are two RATs, it can be summarized as follows (a short numerical sketch of steps 1 and 2-2 is given after the list):

1) Determine the proportion of data bits to be transmitted over each RAT in order to minimize the transmission time, given the data rates achievable on each RAT. Denoting the data rates by R1 and R2, and the total number of bits to be transmitted by D = D1 + D2, where D1 and D2 are the numbers of bits to be transmitted over RAT1 and RAT2 respectively, D1 and D2 should be selected such that D1/R1 = D2/R2.

2) The exchange of the secrecy parameters between sender and receiver is then done over the two RATs in order to maintain the security of the transmission. To avoid the complexity of public-key cryptography, a three-way handshake can be used:

2-1) The sender decides to divide the data into n parts, with a fraction n1 sent on RAT1 and a fraction n2 sent on RAT2, according to the ratios determined in step 1 (i.e., the sum of the bits in the n1 parts should be close to D1 bits, and the sum of the bits in the n2 parts should be close to D2 bits).

2-2) The sender generates a scrambling vector P(D,n) to scramble the n data parts and transmit them out of order.

2-3) The sender groups the secret information S = {n, n1, n2, P(D,n)}, possibly adding information to protect against replay attacks (e.g., a timestamp or nonce), and sends it over the two RATs, each copy encrypted with a different key: K11 on RAT1 and K12 on RAT2. Thus, {S}_K11 is sent on RAT1 and {S}_K12 is sent on RAT2.

2-4) The receiver does not know K11 and K12. It therefore encrypts the received information with two other keys, K21 (over RAT1) and K22 (over RAT2), and sends the results back: {{S}_K11}_K21 is sent on RAT1 and {{S}_K12}_K22 is sent on RAT2.

2-5) The sender decodes the received encrypted vectors using its own keys and sends back {S}_K21 on RAT1 and {S}_K22 on RAT2. The secret information is still securely encoded by the receiver's secret keys K21 and K22.

2-6) The receiver can now decrypt the information and obtain S.

3) The two parties can now communicate using the secret scrambling approach provided by S. This information can be changed periodically as needed. For example, if the data is subdivided into 10 parts, with 40% to be sent over LTE and 60% over WiFi, according to the scrambling vector P(D,n) = {4,1,10,8,7,3,9,5,2,6}, then parts {4,1,10,8} are sent over LTE and parts {7,3,9,5,2,6} are sent over WiFi.
The receiver will sort them out in the correct order.
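As referenced in the list above, here is a minimal numerical sketch of step 1 (splitting the bits so that D1/R1 = D2/R2) and step 2-2 (generating the scrambling vector P(D,n)). The rates, data size, seed, and part count are hypothetical values chosen only to mirror the 40%/60% example.

```python
# Sketch of the transmission-time-balancing split and the part scrambling.
import random

def split_bits(D, R1, R2):
    """Return (D1, D2) with D1 + D2 = D and D1/R1 = D2/R2 (up to rounding)."""
    D1 = round(D * R1 / (R1 + R2))
    return D1, D - D1

def scramble(n, seed=None):
    """Scrambling vector P(D, n): a random permutation of the n part indices."""
    perm = list(range(1, n + 1))
    random.Random(seed).shuffle(perm)
    return perm

D, R1, R2 = 10_000, 40, 60          # total bits; rates on RAT1 (LTE) and RAT2 (WiFi)
D1, D2 = split_bits(D, R1, R2)      # -> 4000 bits on RAT1, 6000 bits on RAT2
n = 10                              # number of data parts
P = scramble(n, seed=7)             # out-of-order transmission schedule
n1 = round(n * D1 / D)              # parts carried over RAT1 (here 4 of 10)
rat1_parts, rat2_parts = P[:n1], P[n1:]
print(D1, D2, rat1_parts, rat2_parts)
```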
-
-
-
Crowdsourced MultiView Live Video Streaming using Cloud Computing
Authors: Aiman Erbad and Kashif Bilal

Multi-view videos are composed of multiple video streams captured simultaneously with multiple cameras from various angles (different viewpoints) of a scene. Multi-view videos offer a more appealing and realistic view of the scene, leading to higher user satisfaction and enjoyment. However, displaying realistic, live multi-view scenes captured from a limited number of viewpoints faces multiple challenges, including the precise synchronization of many cameras, color differences among cameras, large bandwidth, computation, and storage requirements, and complex encoding. Current multi-view video setups are very limited and studio-based. We propose a novel system to collect the individual video streams (views) captured of the same event by multiple attendees and combine them into multi-view videos, where viewers can watch the event from various angles, taking crowdsourced media streaming to a new immersive level. The proposed system is called Cloud-based Multi-View Crowdsourced Streaming (CMVCS), and it delivers multiple views of an event to viewers at the best possible video representation based on each viewer's available bandwidth. CMVCS is a complex system with many research challenges. In this study, we focus on resource allocation in the CMVCS system. The objective is to maximize overall viewer satisfaction by allocating the available resources to transcode views into an optimal set of representations, subject to computational and bandwidth constraints. We choose the video representation set that maximizes QoE using Mixed Integer Programming (MIP). Moreover, we propose a Fairness-Based Representation Selection (FBRS) heuristic algorithm to solve the resource allocation problem efficiently. We compare our results with the optimal and Top-N strategies. The simulation results demonstrate that FBRS generates near-optimal results and outperforms the state-of-the-art Top-N policy, which is used by a large-scale system (Twitch). Moreover, we consider region-based distributed datacenters to minimize the overall end-to-end latency. To further enhance viewers' satisfaction and Quality of Experience (QoE), we propose an edge-based cooperative caching and online transcoding strategy to minimize delay and backhaul bandwidth consumption. Our main research contributions are:
- We present the design and architecture of a Cloud-based Multi-View Crowdsourced Streaming (CMVCS) system that allows viewers to experience captured events from various angles.
- We propose a QoE metric to determine overall user satisfaction based on the received view representation, the viewer's bandwidth capability, and the end-to-end latency between the viewer and the transcoding site.
- We formulate a Mixed Integer Programming (MIP) optimization problem for multi-region distributed resource allocation that chooses the optimal set of views and representations to maximize QoE in constrained settings.
- We propose a fairness-based heuristic algorithm to find near-optimal resource allocations efficiently.
- We propose an edge-computing-based video caching and online transcoding strategy to minimize delay and backhaul network consumption.
- We use multiple real-world traces to simulate various scenarios and show the efficiency of the proposed solution.
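To picture what a fairness-driven representation selection of this kind does, the sketch below repeatedly upgrades whichever view currently has the lowest utility, subject to bandwidth and transcoding budgets. This is an illustrative greedy heuristic with invented representations and budgets, not the authors' FBRS algorithm or QoE metric.

```python
# Sketch: fairness-style greedy allocation of transcoding representations.

def greedy_fair_allocation(views, bw_budget, cpu_budget):
    """views: {view_id: [(bitrate, cpu_cost, utility), ...]} ordered by quality.
    Returns {view_id: index of chosen representation, or None if unserved}."""
    choice = {v: None for v in views}
    bw_used = cpu_used = 0.0
    upgraded = True
    while upgraded:
        upgraded = False
        # current utility of a view: 0 if it has no representation yet
        def utility(v):
            return 0.0 if choice[v] is None else views[v][choice[v]][2]
        # try to upgrade the currently worst-served view first
        for v in sorted(views, key=utility):
            nxt = 0 if choice[v] is None else choice[v] + 1
            if nxt >= len(views[v]):
                continue                      # already at the best representation
            cur_bw = views[v][choice[v]][0] if choice[v] is not None else 0.0
            cur_cpu = views[v][choice[v]][1] if choice[v] is not None else 0.0
            d_bw = views[v][nxt][0] - cur_bw
            d_cpu = views[v][nxt][1] - cur_cpu
            if bw_used + d_bw <= bw_budget and cpu_used + d_cpu <= cpu_budget:
                bw_used += d_bw
                cpu_used += d_cpu
                choice[v] = nxt
                upgraded = True
                break
    return choice

# Hypothetical example: two views, each with (bitrate, cpu_cost, utility) options.
views = {
    "view_A": [(1.0, 1.0, 2.0), (2.5, 2.0, 3.5), (5.0, 4.0, 5.0)],
    "view_B": [(1.0, 1.0, 2.0), (2.5, 2.0, 3.5)],
}
print(greedy_fair_allocation(views, bw_budget=6.0, cpu_budget=5.0))
```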
-