Qatar Foundation Annual Research Conference Proceedings Volume 2014 Issue 1
- Conference date: 18-19 Nov 2014
- Location: Qatar National Convention Center (QNCC), Doha, Qatar
- Volume number: 2014
- Published: 18 November 2014
Energy Storage System Sizing For Peak Hour Utility Applications In Smart Grid
Authors: Islam Safak Bayram, Mohamed Abdallah and Khalid Qaraqe
Energy Storage Systems (ESS) are expected to play a critical role in future energy grids. ESS technologies are primarily employed to reduce the stress on the grid and the use of hydrocarbons for electricity generation. However, for the ESS option to become economically viable, proper sizing is essential to recover the high capital cost. In this paper we propose a system architecture that enables us to size the ESS optimally according to the number of users. We model the demand of each customer by a two-state Markovian fluid, and the aggregate demand of all users is multiplexed at the ESS. The proposed model also draws a constant power from the grid, which is used to accommodate customer demand and, if required, to charge the storage unit. Then, given the population of customers, their stochastic demands, and the power drawn from the grid, we provide an analytical solution for ESS sizing using the underflow probability as the main performance metric, defined as the percentage of time that the system resources fall short of demand. Such insights are very important in the planning phases of future energy grid infrastructures.
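The underflow-probability metric lends itself to a simple simulation check. The sketch below is a minimal illustration, not the authors' analytical model: the number of users, the two-state on/off demand parameters, the grid draw and the storage capacity are hypothetical placeholders, and the code simply estimates the fraction of time slots in which aggregate demand exceeds the grid supply plus the energy available in storage.

```python
import random

def simulate_underflow(n_users=50, p_up=0.1, p_down=0.2, demand_on=1.0,
                       grid_power=20.0, capacity=40.0, slots=50_000, seed=1):
    """Monte Carlo estimate of the underflow probability.

    Each user is a two-state (on/off) Markov source: an off user turns on
    with probability p_up per slot, an on user turns off with probability
    p_down, and draws demand_on kW while on.  The grid supplies a constant
    grid_power; surpluses charge the storage unit (up to capacity) and
    deficits are served from it.  A slot counts as underflow when demand
    exceeds grid power plus stored energy.  All parameter values are
    illustrative placeholders.
    """
    rng = random.Random(seed)
    on = [False] * n_users
    stored, underflows = capacity, 0
    for _ in range(slots):
        on = [(rng.random() < p_up) if not s else (rng.random() >= p_down)
              for s in on]
        deficit = sum(on) * demand_on - grid_power      # >0 means shortfall
        if deficit <= 0:
            stored = min(capacity, stored - deficit)    # charge with surplus
        elif deficit <= stored:
            stored -= deficit                           # cover from storage
        else:
            stored, underflows = 0.0, underflows + 1
    return underflows / slots

if __name__ == "__main__":
    for cap in (0.0, 20.0, 40.0, 80.0):
        print(f"capacity={cap:5.1f}  underflow ≈ {simulate_underflow(capacity=cap):.4f}")
```

Sweeping the capacity against a target underflow probability produces the kind of sizing curve that the analytical model described in the abstract is designed to deliver in closed form.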
An Enhanced Dynamic-programming Technique For Finding Approximate Overlaps
Authors: Maan Haj Rachid and Qutaibah Malluhi
Next-generation sequencing technology creates a huge number of sequences (reads), which constitute the input for genome assemblers. After prefiltering the sequences, it is necessary to detect exact overlaps between the reads to prepare the ingredients needed to assemble the genome. The standard method is to find the maximum exact suffix-prefix match between each pair of reads after executing an error-detection technique. This is the approach applied in most assemblers; however, a few studies have worked on finding approximate overlaps instead. This direction can be useful when error detection and prefiltering techniques are very time consuming and not very reliable. However, there is a huge difference in terms of complexity between exact and approximate matching techniques. Therefore, any improvement in time can be valuable when approximate overlap is the target. The naive technique for finding approximate overlaps applies a modified version of dynamic programming (DP) to every pair of reads, which consumes O(n²) time, where n is the total size of all reads. In this work, we take advantage of the fact that many reads share prefixes, so some work is continuously repeated. For example, consider the sequences in Figure 1. If dynamic programming is applied to S1 and S2, and S2 and S3 share a prefix of length 4, then the calculation of a portion of the DP table of size |S1| × 5 can be avoided when applying the algorithm to S1 and S3 (the shaded area in Figure 1). Figure 1 shows the DP table for the S1,S2 alignment, assuming gap = 1, match = 0 and mismatch = 1; no calculation of the shaded area is required when computing the S1,S3 table, since S2 and S3 share the prefix AGCC. The modification is based on this observation: first, the reads are sorted in lexicographical order and the longest common prefix (LCP) between every two consecutive reads is found. Let group G denote the reads after sorting. For every string S, we compute the DP table for S and every other string in G. Since the reads are sorted, a portion of the DP table can be skipped for every string, depending on the size of the LCP, which has already been calculated in the previous step. We implemented the traditional technique for finding approximate overlaps with and without the proposed modification. The results show an improvement of 10-61% in time. The explanation for this wide range is that the gain in performance depends on the number of strings: the larger the number of strings, the better the gain in performance, since the sizes of the LCPs are then typically larger.
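The prefix-reuse idea can be illustrated with the sketch below, which is an illustrative reimplementation rather than the paper's code: it fills a semi-global DP table of one read s against each read in sorted order, column by column over the current read, and simply keeps the first LCP+1 columns when moving from one read to the next, since those columns depend only on the shared prefix.

```python
def lcp(a, b):
    """Length of the longest common prefix of two strings."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

def approx_overlaps(s, reads, gap=1, mismatch=1, match=0):
    """Best approximate suffix(s)-prefix(read) overlap cost for every read.

    Rows follow s, columns follow the current read.  Column 0 is all zeros,
    so the alignment may start anywhere in s (a suffix), while row 0 charges
    a gap per skipped read character, anchoring the read at its start (a
    prefix).  Reads are processed in sorted order and the first lcp+1
    columns are reused.  Illustrative sketch only; a minimum overlap length
    threshold is omitted for brevity.
    """
    results, cols, prev = {}, [], ""
    for r in sorted(reads):
        del cols[lcp(prev, r) + 1:]                # keep the reusable columns
        if not cols:
            cols.append([0] * (len(s) + 1))        # column 0: free start in s
        for j in range(len(cols), len(r) + 1):     # fill only the new columns
            left = cols[-1]
            c = [left[0] + gap]                    # row 0: read prefix must be aligned
            for i in range(1, len(s) + 1):
                diag = left[i - 1] + (match if s[i - 1] == r[j - 1] else mismatch)
                c.append(min(diag, c[i - 1] + gap, left[i] + gap))
            cols.append(c)
        results[r] = min(col[len(s)] for col in cols[1:])   # overlap ends at end of s
        prev = r
    return results

print(approx_overlaps("TTAGCC", ["AGCCAA", "AGCCTG", "CCGGTT"]))
```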
Practical Quantum Secure Communication Using Multi-photon Tolerant Protocols
This paper presents an investigation of practical quantum secure communication using multi-photon tolerant protocols. Multi-photon tolerant protocols loosen the limit on the number of photons imposed by currently used quantum key distribution protocols. The multi-photon tolerant protocols investigated in this paper are multi-stage protocols that do not require any prior agreement between a sender Alice and a receiver Bob. The security of such protocols stems from the fact that the optimal detection strategies of the legitimate users and the eavesdropper are asymmetrical, allowing Bob to obtain measurement results deterministically while imposing unavoidable quantum noise on the eavesdropper Eve's measurements. Multi-photon tolerant protocols are based on the use of transformations known only to the communicating party applying them, i.e., either Alice or Bob. In this paper, multi-photon tolerant protocols are used to share a key or a message between a sender Alice and a receiver Bob. Thus such protocols can be used either as quantum key distribution (QKD) protocols or as quantum communication protocols. In addition, multi-stage protocols can be used to share a key between Alice and Bob, with the shared key then used as a seed key for a single-stage protocol, a scheme called the braiding concept. This paper presents a practical study of multi-photon tolerant multi-stage protocols. Security aspects as well as challenges to the practical implementation are discussed. In addition, secret raw key generation rates are calculated with respect to both losses and distances over a fiber-optic channel. It is well known that raw key generation rates decrease as channel losses and distances increase. In this paper, coherent non-decoying quantum states are used to transfer the encoded bits from Alice to Bob. Raw key generation rates are calculated for different average photon numbers µ and compared with the case of µ = 0.1, which is the average number of photons used in most single-photon-based QKD protocols. Furthermore, an optimal average number of photons to be used within the secure region of the multi-photon tolerant protocols is calculated. It is worth noting that, with the increased key generation rates and communication distances offered by the multi-photon tolerant protocols, quantum secure communication need not be restricted to quantum key distribution; it can be elevated to attain direct quantum secure communication.
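As a rough illustration of how raw detection rate falls with fiber length, the sketch below uses a generic loss model rather than the protocol analysis in the paper: the channel transmittance follows the usual 10^(-αL/10) law for a fiber attenuation α in dB/km, and the raw rate is approximated as pulse rate × mean photon number × transmittance × detector efficiency. All parameter values are placeholders, and dark counts and error correction are ignored.

```python
def raw_key_rate(mu, length_km, pulse_rate_hz=1e6,
                 alpha_db_per_km=0.2, detector_eff=0.1):
    """Crude raw detection-rate estimate over optical fiber.

    The expected number of detected photons per pulse is approximated by
    mu * transmittance * detector_eff.  This is a generic textbook-style
    estimate, not the paper's security or rate analysis.
    """
    transmittance = 10 ** (-alpha_db_per_km * length_km / 10)
    return pulse_rate_hz * mu * transmittance * detector_eff

for L in (10, 25, 50, 100):
    weak, brighter = raw_key_rate(0.1, L), raw_key_rate(1.0, L)
    print(f"{L:3d} km:  mu=0.1 -> {weak:9.1f} /s   mu=1.0 -> {brighter:9.1f} /s")
```

The comparison between µ = 0.1 and a larger mean photon number is the kind of trade-off the abstract refers to when it discusses the optimal average photon number within the protocols' secure region.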
Power Grid Protection
Authors: Enrico Colaiacovo and Ulrich Ottenburger
Due to its inherent short-term dynamics, the power grid is a critical component of the energy system. When a dangerous event occurs in a section of the grid (e.g. a power line or a plant fails), the overall system is subject to the risk of a blackout. The time available to counteract the risk is very short (only a few milliseconds) and there are no tools to ensure power to a number of selected critical facilities. A way to tackle the blackout risk, and to implement smart management of the remaining part of the grid, is a distributed control system with preemptive commands. It is based on the idea that, in case of dangerous events, there will definitely be no time to inform the control center, make a decision and send commands to the active components of the power grid where, finally, they would be executed. The idea consists in implementing an intelligent distributed control system that continuously supervises the critical components of the power grid. It monitors the operational conditions and evaluates the ability of single components to work properly, together with their probability of an outage. In parallel, the control system continuously issues preemptive commands to counteract the outages expected on a probabilistic basis. The preemptive commands can be defined taking into account the sensitivity of different network elements to specific outages and, of course, on the basis of a priority rule that preserves power for the strategic sites. In case of a dangerous event, the monitoring device sends messages directly to all the actuator devices, where the action will be performed only if a preemptive command was previously delivered. This means that the latency of the traditional control chain is reduced to the latency of communication between monitoring and actuator devices. The first consequence of this policy is that an event which could potentially cause a complete blackout will affect only a limited portion of the grid. The second consequence is that the control system will choose the network elements involved in the emergency procedure, preserving the strategic plants. The third consequence is that, with this kind of control, the power grid goes from one N-1 stable status to another N-1 stable status. The system loses some generation and load, but it keeps its stability and its standard operation.
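A minimal sketch of the preemptive-command pattern described above is shown below; it is an illustration of the control flow, not the authors' implementation, and the device names, events and actions are invented. Actuators hold commands pre-armed for specific outage events and act only when a monitoring alarm for a matching event arrives, so the decision latency reduces to a single monitoring-to-actuator message.

```python
from dataclasses import dataclass, field

@dataclass
class Actuator:
    """Grid actuator that executes only pre-armed (preemptive) commands."""
    name: str
    armed: dict = field(default_factory=dict)    # event id -> action to perform

    def arm(self, event_id: str, action: str):
        self.armed[event_id] = action            # preemptive command delivered earlier

    def on_alarm(self, event_id: str):
        action = self.armed.get(event_id)
        if action is None:
            return f"{self.name}: no preemptive command for {event_id}, do nothing"
        return f"{self.name}: executing '{action}' for {event_id}"

class Monitor:
    """Monitoring device that broadcasts an alarm directly to all actuators."""
    def __init__(self, actuators):
        self.actuators = actuators

    def detect(self, event_id: str):
        # No round trip to a control centre: the alarm goes straight out.
        return [a.on_alarm(event_id) for a in self.actuators]

# Illustrative wiring: the control system has armed two actuators for a line fault.
feeder_a, feeder_b = Actuator("feeder-A"), Actuator("feeder-B")
feeder_a.arm("line-12-trip", "shed non-critical load")
feeder_b.arm("line-12-trip", "island with local generation")
for msg in Monitor([feeder_a, feeder_b]).detect("line-12-trip"):
    print(msg)
```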
A Conceptual Model For Tool Handling In The Operation Room
Authors: Juan Wachs and Dov Dori
Background & Objectives: There are 98,000 deaths annually in the US due to errors in the delivery of healthcare, which cause inpatient mortality and morbidity. Among these errors, ineffective team interaction in the operating room (OR) is one of the main causes. Recently, it has been suggested that developing a conceptual model of verbal and non-verbal exchanges in the OR could lead to a better understanding of the dynamics among the surgical team, and this, in turn, could result in a reduction in miscommunication in the OR. In this work, we describe the main principles characterizing Object-Process Methodology (OPM). This methodology makes it possible to describe the complex interactions between surgeons and the surgical staff while surgical instruments are delivered during a procedure. The main objective of such a conceptual model is to assess when and how errors occur during the request and delivery of instruments, and how to avoid them. Methods: The conceptual model was constructed from direct observations of surgical procedures and of miscommunication cases in the OR. While the interactions in the OR are rather complex, the compact ontology of OPM allows stateful objects and processes to interact mutually and generate measurable outcomes. The instances modeled relate to verbal and non-verbal communication (e.g. gestures, proxemics), and the potential mistakes are modeled as processes that deviate from the "blue ocean" scenario. The OPM model was constructed through an iterative process of data collection through observation, modeling, brainstorming, and synthesis. This conceptual model provides the basis for new theories and frameworks needed to characterize OR communication. Results: The adopted model can accurately express the intricate exchanges that take place in the OR during a surgical procedure. A key component of the conceptual model is the ability to specify features at various levels of detail, with each level represented by a different diagram. Nevertheless, each diagram is contextually linked to all the others. The resulting model thus provides a powerful and expressive ontology of verbal and non-verbal communication exchanges in the OR. Concretely, the model is validated through structured questionnaires, which allow assessing the level of consensus on criteria such as flexibility, accuracy, and generality. Conclusion: A conceptual model was presented describing the tool-handling processes during operations conducted in the OR. The focus is placed on communication exchanges between the main surgeon and the surgical technician. The objective is to create a tool to "debug" and identify the exact circumstances in which surgical delivery errors can happen. Our next step is the implementation of a robotic assistant for the OR, which can deliver and retrieve surgical instruments. A necessary requirement for the introduction of such a cybernetic solution is the development of a concise specification of these interactions in the OR. The development of this conceptual model can have a significant impact both on the reduction of tool-handling-related errors and on the formal design of robots that could complement surgical technicians in their routine tool-handling activities during surgery.
Efficient Multiple Users Combining And Scheduling In Wireless Networks
Authors: Mohammad Obaidah Shaqfeh and Hussein Alnuweiri
Wireless networking plays a vital role in our daily lifestyle and has tremendous applications in almost all fields of the economy. The wireless medium is a shared medium and, hence, user scheduling is needed to allow multiple users to access the channel jointly. Furthermore, the wireless channel is characterized by its time-based and location-based variations due to physical phenomena such as multi-path propagation and fading. Therefore, unlike traditional persistent round-robin scheduling schemes, current telecommunication standards support channel-aware opportunistic scheduling in order to exploit the varying channels of the users when they are at their peak conditions. The advantages of these schemes in enhancing the expected throughput of the network are evident and well demonstrated. However, these schemes are based on selecting a single user to access a given frequency sub-channel at a given time, in order to avoid creating interference when more than one user accesses the same channel. Nevertheless, allowing multiple users to access the same channel is feasible using special coding techniques such as superposition coding with successive interference cancellation at the receivers. The main advantage of this is to improve the spectral efficiency of the precious wireless spectrum and to enhance the overall throughput of the network while maintaining the quality-of-service requirements of all users. Despite their advantages, multiple-user scheduling schemes require proper resource allocation algorithms to process the channel-condition measurements in order to decide which users should be served in a given time slot and frequency sub-channel, and the allocated data rate and power of each link, so as to maximize the transmission efficiency. Failure to use a suitable resource allocation and scheduling scheme can degrade the performance significantly. We design and analyze the performance of efficient multiple-user scheduling schemes for wireless networks. One scheme is proven theoretically to be the most efficient; however, its computational load is significant. The other scheme is a sub-optimal scheme with a low computational load that achieves very good performance, comparable to the optimal scheme. Furthermore, we evaluate the performance gains of multiple-user scheduling over conventional single-user scheduling under different constraints, such as hard fairness and proportional fairness among the users, and for fixed merit weights of the users based on their service class. In all of these cases, our proposed schemes can achieve a gain that may exceed 10% in terms of data rate (bits/sec). This gain is significant taking into consideration that we use the same air-link and power resources as the conventional single-user scheduling schemes.
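To make the scheduling decision concrete, the sketch below is a simplified, hypothetical two-user example rather than the authors' optimal algorithm: on one sub-channel it compares the best weighted rate of serving a single user against serving an ordered pair with superposition coding over a coarse grid of power splits ρ, assuming the stronger user cancels the weaker user's signal (textbook broadcast-channel rate expressions). The gains and merit weights are made up.

```python
import itertools, math

def single_rate(gain, power=1.0, noise=1.0):
    """Shannon rate (bits/s/Hz) when one user gets the whole sub-channel."""
    return math.log2(1 + gain * power / noise)

def superposition_rates(strong_gain, weak_gain, rho, power=1.0, noise=1.0):
    """Rates for two users sharing a sub-channel via superposition coding.

    A fraction rho of the power carries the strong user's signal; the strong
    user cancels the weak user's signal (SIC), while the weak user treats the
    strong user's signal as noise.
    """
    r_strong = math.log2(1 + rho * power * strong_gain / noise)
    r_weak = math.log2(1 + (1 - rho) * power * weak_gain
                       / (rho * power * weak_gain + noise))
    return r_strong, r_weak

def schedule_subchannel(gains, weights, rho_grid=(0.2, 0.4, 0.6, 0.8)):
    """Pick the weighted-sum-rate-maximizing assignment on one sub-channel.

    Candidates are every single user and every ordered pair with a coarse
    power-split grid; this brute force stands in for the paper's resource
    allocation algorithm and is only meant to show the decision being made.
    """
    best = max((weights[u] * single_rate(g), (u,)) for u, g in enumerate(gains))
    for s, w in itertools.permutations(range(len(gains)), 2):
        if gains[s] < gains[w]:
            continue                      # the strong user must have the better channel
        for rho in rho_grid:
            rs, rw = superposition_rates(gains[s], gains[w], rho)
            best = max(best, (weights[s] * rs + weights[w] * rw, (s, w, rho)))
    return best

# Hypothetical per-user channel gains and merit weights on one sub-channel.
print(schedule_subchannel(gains=[8.0, 0.9, 2.5], weights=[1.0, 2.5, 1.0]))
```

With these numbers the scheduler prefers superposing the strong user 0 with the highly weighted weak user 1 over serving any single user, which is exactly the kind of gain over single-user scheduling discussed in the abstract.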
Maximizing The Efficiency Of Wind And Solar-based Power Generation By Gis And Remotely Sensed Data In Qatar
Authors: Ramin Nourqolipour and Abu Taleb Ghezelsoflou
Qatar has high potential to develop renewable energy generating systems, especially through solar and wind-based technologies. Although substantial initiatives have been undertaken in Qatar to reduce the high per capita emissions of greenhouse gases (GHG), solar and wind-based energy generation can also contribute significantly to the mitigation of climate change. The mean Direct Normal Irradiance (DNI) of Qatar is about 2008 kWh/m2/y, which is suitable for developing solar power systems, given that 1800 kWh/m2/y is enough to establish Concentrated Solar Power (CSP) plants. Although the cost of developing solar-based power generation systems is about twice that of gas-based power generation, it delivers environmentally friendly energy while preserving the limited gas resources. Moreover, given that 3 m/s is the critical wind speed for power generation, Qatar experiences wind speeds above this critical value almost 80% of the time, which represents a great potential for developing wind-based energy systems. In terms of economic feasibility, the minimum required number of full-load hours is 1400, and the figure for Qatar is higher than this critical value. Furthermore, establishing a wind power plant is cheaper than a gas-based one in off-shore locations, even though the power generation is lower. This paper explains a methodology to determine the most suitable sites for developing solar and wind-based power plants in order to maximize the efficiency of power generation using remote sensing and GIS. Analyses are carried out on two sets of spatial data: layers derived from a recent Landsat 8 image, such as land cover, urban and built-up areas, roads, water sources, and constraints, along with bands 10 and 11 (the thermal bands) of the same sensor for the year 2014; and a DEM (Digital Elevation Model) derived from SRTM V2 (Shuttle Radar Topography Mission) used to generate slope, aspect, and solar maps, together with wind data obtained from the Qatar Meteorology Department. The data are used to conduct two parallel Multi-Criteria Evaluation (MCE) analyses, one for each development objective (solar and wind power plant development), through the following stages: (1) data preparation and standardization using categorical data rescaling and fuzzy set membership functions, and (2) logistic regression-based analysis to determine the suitability of each pixel for the desired development objective. The analysis produces two distinct suitability maps, each addressing suitable areas for establishing solar or wind power plants. The obtained suitability maps are then processed with a multi-objective land allocation model to allocate the areas that show the highest potential for developing both solar and wind-based power generation. Results show that the off-shore sites suitable for both objectives are mainly distributed in the north and north-west regions of Qatar.
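As an illustration of the standardization and combination steps, the sketch below is a generic MCE fragment, not the authors' GIS workflow: criterion rasters (here tiny hypothetical arrays) are rescaled to [0, 1] with a linear fuzzy membership function, constraint cells are masked out, and a logistic model with made-up coefficients combines the standardized layers into a per-pixel suitability score.

```python
import numpy as np

def fuzzy_linear(layer, low, high, increasing=True):
    """Linear fuzzy membership: rescale a criterion raster to [0, 1]."""
    score = np.clip((layer - low) / (high - low), 0.0, 1.0)
    return score if increasing else 1.0 - score

def suitability(layers, coeffs, intercept, constraint_mask):
    """Logistic combination of standardized criteria into a suitability map.

    layers: dict of standardized rasters; coeffs: hypothetical logistic
    regression coefficients (one per criterion); constraint_mask: True where
    development is excluded.  The shapes mimic an MCE pipeline only.
    """
    z = intercept + sum(coeffs[name] * layer for name, layer in layers.items())
    prob = 1.0 / (1.0 + np.exp(-z))
    return np.where(constraint_mask, 0.0, prob)

# Tiny illustrative rasters (3x3 pixels) standing in for real GIS layers.
dni = fuzzy_linear(np.array([[1700, 1900, 2100],
                             [1800, 2000, 2050],
                             [1750, 1950, 2008]]), low=1800, high=2100)
slope = fuzzy_linear(np.array([[2, 5, 1],
                               [8, 3, 2],
                               [1, 4, 6]]), low=0, high=10, increasing=False)
constraints = np.array([[False, False, True],     # e.g. a built-up or protected cell
                        [False, False, False],
                        [True, False, False]])

solar_map = suitability({"dni": dni, "slope": slope},
                        coeffs={"dni": 3.0, "slope": 1.5},
                        intercept=-2.0, constraint_mask=constraints)
print(np.round(solar_map, 2))
```

Running the same pipeline with wind-speed and full-load-hour layers would yield the second suitability map, and the two maps are what the multi-objective land allocation step then reconciles.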
An Efficient Model For Sentiment Classification Of Arabic Tweets On Mobiles
Authors: Gilbert Badaro, Ramy Baly, Hazem Hajj, Nizar Habash, Wassim El-hajj and Khaled Shaban
With the growth of social media and online blogs, people express their opinions and sentiments freely by providing product reviews, as well as comments about celebrities and political and global events. These texts reflecting opinions are of great interest to companies and individuals who base their decisions and actions upon them. Hence, opinion mining on mobiles is capturing the interest of users and researchers across the world as the amount of available online data grows. Many techniques and applications have been developed for English, while many other languages are still trying to catch up. In particular, there is increased interest in easy access to Arabic opinion from mobiles. In fact, Arabic presents challenges similar to English for opinion mining, but also additional challenges due to its morphological complexity. Mobiles, on the other hand, present their own challenges due to limited energy, limited storage, and low computational capability. Since some of the state-of-the-art methods for opinion mining in English require the extraction of large numbers of features and extensive computation, these methods are not feasible for real-time processing on mobile devices. In this work, we provide a solution that addresses both the limitations of the mobile device and the Arabic resources required to perform opinion mining on mobiles. The method is based on matching stemmed tweets to our own Arabic sentiment lexicon (ArSenL). While there have been efforts towards building Arabic sentiment lexicons, they suffer from many deficiencies, including limited size, an unclear usability plan given Arabic's rich morphology, or lack of public availability. ArSenL is the first publicly available large-scale Standard Arabic sentiment lexicon, developed using a combination of the English SentiWordNet (ESWN), Arabic WordNet, and the Standard Arabic Morphological Analyzer (SAMA). A public interface for browsing ArSenL is available at http://me-applications.com/test. The scores from the matched stems are aggregated and processed through a decision tree to determine the polarity. The method was tested on a published set of Arabic tweets, and an average accuracy of 67% was achieved versus a 50% baseline. A mobile application was also developed to demonstrate the usability of the method. The application takes as input a topic of interest and retrieves the latest Arabic tweets related to this topic. It then displays the tweets superimposed with colors representing the sentiment labels positive, negative or neutral. The application also provides visual summaries of searched topics and a history showing how the sentiment for a certain topic has been evolving.
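The lexicon-matching step can be pictured with the small sketch below, which is illustrative only: the lexicon entries, the toy stemmer and the thresholds are invented placeholders, not ArSenL data or the paper's decision tree. Each stemmed token is looked up in a stem-to-(positive, negative) score table, the scores are aggregated, and a simple rule assigns the polarity label.

```python
# Hypothetical miniature lexicon: stem -> (positive score, negative score).
TOY_LEXICON = {
    "jamil": (0.8, 0.0),    # "beautiful"
    "sayyi": (0.0, 0.7),    # "bad"
    "farah": (0.9, 0.1),    # "joy"
}

def toy_stem(token: str) -> str:
    """Placeholder stemmer: lowercases and strips a common prefix and punctuation."""
    t = token.lower().strip(".,!?")
    for prefix in ("al", "wa"):
        if t.startswith(prefix) and len(t) > len(prefix) + 2:
            t = t[len(prefix):]
    return t

def classify_tweet(tweet: str, pos_threshold=0.2, neg_threshold=0.2) -> str:
    """Aggregate lexicon scores of matched stems and map them to a label."""
    pos = neg = 0.0
    for token in tweet.split():
        scores = TOY_LEXICON.get(toy_stem(token))
        if scores:
            pos += scores[0]
            neg += scores[1]
    if pos - neg > pos_threshold:
        return "positive"
    if neg - pos > neg_threshold:
        return "negative"
    return "neutral"

print(classify_tweet("alJamil farah!"))   # -> positive
print(classify_tweet("sayyi jiddan"))     # -> negative
```

Keeping the per-tweet work to a dictionary lookup per token is what makes this kind of approach practical on energy- and compute-limited mobile devices.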
Email Authorship Attribution In Cyber Forensics
Email is one of the most widely used forms of written communication over the Internet, and its use has increased tremendously for both personal and professional purposes. The increase in email traffic comes with an increase in the use of email for illegitimate purposes to commit all sorts of crimes. Phishing, spamming, email bombing, threatening, cyber bullying, racial vilification, child pornography, virus and malware propagation, and sexual harassment are common examples of email abuse. Terrorist groups and criminal gangs also use email systems as a safe channel for their communication. The alarming increase in the number of cybercrime incidents using email is mostly due to the fact that email can be easily anonymized. The problem of email authorship attribution is to identify the most plausible author of an anonymous email from a group of potential suspects. Most previous contributions employed a traditional classification approach, such as decision trees or Support Vector Machines (SVM), to identify the author and studied the effects of different writing-style features on the classification accuracy. However, little attention has been given to ensuring the quality of the evidence. In this work, we introduce an innovative data mining method to capture the write-print of every suspect and model it as combinations of features that occur frequently in the suspect's emails. This notion is called a frequent pattern, which has proven effective in many data mining applications but has not been applied to the problem of authorship attribution. Unlike traditional approaches, the write-print extracted by our method is unique among the suspects and, therefore, provides convincing and credible evidence for presentation in a court of law. Experiments on real-life emails suggest that the proposed method can effectively identify the author and that the results are supported by strong evidence.
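The write-print idea can be sketched as follows; this is an illustrative reconstruction under simple assumptions, not the authors' algorithm. Each email is reduced to a set of discretized stylometric features, frequent feature combinations are mined per suspect (here by brute-force counting with a small support threshold), and combinations that are frequent for only one suspect form that suspect's candidate write-print.

```python
from itertools import combinations
from collections import Counter

def frequent_patterns(emails, min_support=2, max_size=2):
    """Return feature combinations occurring in >= min_support emails.

    `emails` is a list of feature sets (e.g. discretized stylometric markers).
    Brute-force counting stands in for a real frequent-pattern miner such as
    Apriori or FP-growth; it is fine for the tiny illustrative input below.
    """
    counts = Counter()
    for feats in emails:
        for size in range(1, max_size + 1):
            counts.update(combinations(sorted(feats), size))
    return {p for p, c in counts.items() if c >= min_support}

def write_prints(suspect_emails, **kw):
    """Keep, per suspect, only the frequent patterns no other suspect has."""
    frequent = {s: frequent_patterns(e, **kw) for s, e in suspect_emails.items()}
    return {s: pats - set().union(*(frequent[o] for o in frequent if o != s))
            for s, pats in frequent.items()}

# Hypothetical discretized features per email (sentence length, greeting style, ...).
emails = {
    "suspect_A": [{"short_sentences", "greeting_hi", "many_commas"},
                  {"short_sentences", "greeting_hi", "ends_cheers"}],
    "suspect_B": [{"long_sentences", "greeting_dear", "many_commas"},
                  {"long_sentences", "greeting_dear", "no_signature"}],
}
for suspect, pattern_set in write_prints(emails).items():
    print(suspect, sorted(pattern_set))
```

Because the retained patterns are, by construction, absent from every other suspect's frequent set, they are the kind of discriminative evidence the abstract argues is easier to defend in court than an opaque classifier score.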
Msr3e: Distributed Logic Programming For Decentralized Ensembles
Authors: Edmund Lam and Iliano Cervesato
In recent years, we have seen many advances in distributed systems, in the form of cloud computing and distributed embedded mobile devices, drawing more research interest into better ways to harness and coordinate the combined power of distributed computation. While this has made distributed computing resources more readily accessible to main-stream audiences, the fact remains that implementing distributed software and applications that can exploit such resources via traditional distributed programming methodologies is an extremely difficult task. As such, finding effective means of programming distributed systems is more than ever an active and fruitful research and development endeavor. Our work centres on the development of a programming language known as MSR3e, designed for implementing highly orchestrated communication behaviors of an ensemble of computing nodes. Computing nodes are either traditional main-stream computer architectures or mobile computing devices. This programming language is based on logic programming, and is declarative and concurrent. It is declarative in that it allows the programmer to express the logic of synchronization between computing nodes without describing any form of control flow. It is concurrent in that its operational semantics is based on a concurrent programming model known as multiset rewriting. The result is a highly expressive distributed programming language that provides the programmer with a high-level abstraction for implementing highly complex communication behavior between computing nodes. This allows the programmer to focus on specifying what processes need to synchronize between the computing nodes, rather than how to implement the synchronization routines. MSR3e is based on a traditional multiset rewriting model with two important extensions: (1) explicit localization of predicates, allowing the programmer to explicitly reference the locations of predicates as a first-class construct of the language, and (2) comprehension patterns, providing the programmer with a concise means of writing synchronization patterns that match dynamically sized sets of data. This method of programming often results in more concise code (relative to main-stream programming methodologies) that is more human-readable and easier to debug. Its close foundation in logic programming also suggests the possibility of effective automated verification of MSR3e programs. We have currently implemented a prototype of MSR3e. This prototype is a trans-compiler that compiles an MSR3e program into two possible outputs: (1) a C++ program that utilizes the MPI libraries, intended for execution on traditional main-stream computer architectures (e.g., x86), or (2) a Java program that utilizes the WiFi Direct libraries of the Android SDK, intended for execution on Android mobile devices. We have conducted preliminary experiments on a small set of examples to show that MSR3e works in practice. In the future, we intend to refine our implementation of MSR3e, scaling up the experiment suites, as well as developing more non-trivial applications in MSR3e as further proof of concept.
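To give a flavour of the multiset rewriting semantics that underpins MSR3e, the sketch below is an illustrative toy interpreter in Python, not MSR3e itself and not its syntax: facts are tuples tagged with a location, and one rule with a comprehension-like pattern collects all `vote` facts at a coordinator node and rewrites them into a single `tally(count)` fact.

```python
from collections import Counter

# A store is a multiset of located facts: (location, predicate, argument).
store = Counter({
    ("node1", "vote", "yes"): 2,
    ("node2", "vote", "no"): 1,
    ("coord", "collect", None): 1,
})

def forward(store, src, dst):
    """Toy 'communication' rule: move every vote fact from src to dst."""
    out = Counter()
    for (loc, pred, arg), n in store.items():
        key = (dst, pred, arg) if loc == src and pred == "vote" else (loc, pred, arg)
        out[key] += n
    return out

def tally_rule(store, at="coord"):
    """Comprehension-style rule: consume collect() and ALL votes located at
    `at`, producing a single tally(count) fact.  This mimics the shape of a
    comprehension pattern matching a dynamically sized set of facts."""
    votes = [(k, n) for k, n in store.items() if k[0] == at and k[1] == "vote"]
    if store[(at, "collect", None)] == 0 or not votes:
        return store                       # rule not applicable
    out = store.copy()
    out[(at, "collect", None)] -= 1
    total = 0
    for key, n in votes:
        total += n
        del out[key]
    out[(at, "tally", total)] += 1
    return out

for src in ("node1", "node2"):
    store = forward(store, src, "coord")
print(tally_rule(store))
```

In MSR3e the localization and the "match the whole set" step would be written declaratively in a single rule; the imperative loops above only exist to make the rewriting step visible.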
Modelling The Power Produced By Photovoltaic Systems
Authors: Fotis Mavromatakis, Yannis Franghiadakis and Frank Vignola
The development and improvement of a model that can provide accurate estimates of the power produced by a photovoltaic system is useful for several reasons. A reliable model contributes to the proper operation of a photovoltaic power system, since any deviations between modeled and experimental power can be flagged and studied for possible problems that can be identified and addressed. It is also useful for grid operators to know, hours or a day ahead, the contribution from different PV systems or renewable energy systems in general. In this way, they are able to manage and balance production and demand. The model was designed to use the smallest number of free parameters. Apart from the incoming irradiance and module temperature, the model takes into account the effects introduced by the instantaneous angle of incidence and the air mass. The air mass is related to the position of the sun during its apparent motion across the sky, since light travels through an increasing amount of atmosphere as the sun gets lower in the sky. In addition, the model takes into account the reduction in efficiency at low solar irradiance conditions. The model is versatile and can incorporate a fixed or variable percentage for the losses due to the deviation of MPPT tracking from the ideal, module mismatch, soiling, aging, wiring losses and the deviation from the nameplate rating. Angle-of-incidence effects were studied experimentally around solar noon by rotating the PV module to predetermined positions and recording all necessary variables (beam and global irradiances, module temperature and short-circuit current, sun and module coordinates). Air mass effects were studied from sunrise to solar noon with the PV module always normal to the solar rays (global irradiance, temperature and short-circuit current were recorded). Stainless steel meshes were used to artificially reduce the level of the incoming solar irradiance. A pyranometer and a reference cell were placed behind the mesh, while the unobstructed solar irradiance was monitored with a second reference cell. The different mesh combinations allowed us to reach quite low levels of irradiance (10%) with respect to the unobstructed irradiance (100%). Seasonal dust effects were studied by comparing the transmittance of glass samples exposed to outdoor conditions, at weekly time intervals, against a cleaned sample. Data from several different US sites, as well as from PV systems located in Crete, Greece, are currently used to validate the model. Instantaneous values as well as daily integrals are compared to check the performance of the model. At this stage of the analysis, it turns out that the typical accuracy of the model is better than 10% for angles of incidence less than sixty degrees. In addition, the performance of the model as a function of the various parameters, and how these affect the calculations, is being studied. In addition to the functions that have been determined from our measurements, functions available in the literature are also being tested.
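A generic PV performance model with this structure is sketched below for illustration; the correction factors and coefficient values are standard textbook-style placeholders, not the specific functions the authors derived from their measurements. DC power is estimated from the nameplate rating scaled by irradiance, a temperature coefficient, an angle-of-incidence modifier, a low-irradiance efficiency factor and a lumped loss term.

```python
import math

def pv_power(g_poa, t_module, aoi_deg,
             p_nameplate=300.0,        # W at STC (hypothetical module)
             gamma=-0.004,             # power temperature coefficient, 1/K
             losses=0.10,              # lumped soiling/mismatch/wiring losses
             b0=0.05):                 # ASHRAE-style incidence-modifier constant
    """Rough PV DC power estimate (W) from a generic performance model.

    g_poa: plane-of-array irradiance (W/m2); t_module: module temperature (C);
    aoi_deg: angle of incidence (deg).  The incidence-angle modifier, the
    low-light efficiency factor and the coefficients are illustrative
    stand-ins for the empirically fitted functions described in the abstract.
    """
    if g_poa <= 0 or aoi_deg >= 90:
        return 0.0
    f_temp = 1 + gamma * (t_module - 25.0)
    f_aoi = max(0.0, 1 - b0 * (1 / math.cos(math.radians(aoi_deg)) - 1))
    f_lowlight = min(1.0, 0.92 + 0.08 * g_poa / 200.0)   # efficiency droop below ~200 W/m2
    return p_nameplate * (g_poa / 1000.0) * f_temp * f_aoi * f_lowlight * (1 - losses)

for g, t, aoi in [(1000, 45, 10), (600, 35, 40), (150, 25, 60)]:
    print(f"G={g:4d} W/m2  T={t:2d} C  AOI={aoi:2d} deg  ->  {pv_power(g, t, aoi):6.1f} W")
```

Comparing such modeled values against measured instantaneous power and daily integrals is the validation exercise described in the abstract; the mesh and rotation experiments serve to pin down the low-light and angle-of-incidence factors empirically.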
A New Structural View Of The Holy Book Based On Specific Words: Towards Unique Chapters (surat) And Sentences (ayat) Characterization In The Quran
Authors: Meshaal Al-saffar, Ali Mohamed Jaoua, Abdelaali Hassaine and Samir Elloumi
In the context of web Islamic data analysis and authentication, an important task is to be able to authenticate the holy book when it is published on the net. For that purpose, in order to detect texts contained in the holy book, it seems natural to first characterize the words which are specific to existing chapters (i.e. "Sourat") and the words characterizing each sentence in any chapter (i.e. "Aya"). In this research, we first mapped the text of the Quran to a binary context R linking each chapter to all the words contained in it, and by calculating the fringe relation F of R, we were able to discover, in a very short time, all the specific words of each chapter of the holy book. By applying the same approach we found all the specific words of each sentence (i.e. "Aya") within the same chapter, whenever possible. We found that almost all sentences in the same chapter have one or many specific words; only sentences repeated in the same chapter, or sentences included in each other, might not have specific words. The observation of words simultaneously specific to a chapter of the holy book and to a sentence in the same chapter gave us the idea of characterizing all specific sentences in each chapter with respect to the whole Quran. We found that for 42 chapters, all the specific words of a chapter are also specific to some sentence in the same chapter. Such specific words could be used to detect, in a shorter time, websites containing some part of the Quran and should therefore help in checking their authenticity. As a matter of fact, by googling only two or three specific words of a chapter, we observed that the search results are directly related to the corresponding chapter of the Quran. All results have been obtained for Arabic texts with or without vowels. The use of adequate data structures and threads enabled us to produce efficient software written in the Java language. The present tool is directly useful for the recognition of different texts in any domain. In the context of our current project, we plan to use the same methods to characterize Islamic books in general. ACKNOWLEDGMENT: This publication was made possible by a grant from the Qatar National Research Fund through National Priority Research Program (NPRP) No. 06-1220-1-233. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the Qatar National Research Fund or Qatar University.
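The notion of a chapter-specific word can be illustrated with the short sketch below, which works on a toy corpus rather than the Quran text and does not reproduce the fringe-relation computation: it builds the binary chapter-to-word context and keeps, for each chapter, the words that occur in that chapter and in no other.

```python
def specific_words(chapters):
    """Map each chapter to the words that appear in no other chapter.

    `chapters` is a dict: chapter name -> text.  The binary context R is
    represented as chapter -> set of words; a word is 'specific' to a chapter
    when it belongs to that chapter's row of R only.  Toy data stands in for
    the actual Quran corpus used in the abstract.
    """
    context = {name: set(text.split()) for name, text in chapters.items()}
    specific = {}
    for name, words in context.items():
        others = set().union(*(w for n, w in context.items() if n != name))
        specific[name] = words - others
    return specific

toy_corpus = {
    "chapter1": "praise mercy guidance path",
    "chapter2": "mercy faith prayer charity",
    "chapter3": "guidance prayer patience night",
}
for chapter, words in specific_words(toy_corpus).items():
    print(chapter, sorted(words))
```

The same construction applied at the sentence level, restricted to one chapter, gives the sentence-specific words; querying a search engine with two or three of a chapter's specific words is then enough to retrieve pages containing that chapter, as reported in the abstract.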
A Novel Approach To Detection Of Glandular Structures In Colorectal Cancer Histology Images
Authors: Korsuk Sirinukunwattana, David Snead and Nasir Rajpoot
Background: Glands are prevalent organs in the human body, synthesizing hormones and other vital substances. Gland morphology is an important feature in diagnosing malignancy and assessing tumor grade in colorectal adenocarcinomas. However, good detection and segmentation of glands is required prior to the extraction of any morphological features. Objectives: The aim of this work is to generate a glandular map for a histopathological image containing glandular structures. The map indicates the likelihood of different image regions belonging to glandular structures. This information can then be used as a cue for the initial detection of glands. Methods: The pipeline to generate the probability map consists of the following steps. First, a statistical region merging algorithm is employed to generate superpixels. Second, texture and color features are extracted from each superpixel. For texture features, we calculate the coefficients of the scattering transform. This transformation produces features at different scale-spaces which are translation-invariant and Lipschitz-stable to deformation. To summarize the relationship across different scale-spaces, a region-covariance descriptor, which is a symmetric positive definite (SPD) matrix, is calculated. We call this image descriptor the scattering SPD. For color features, we quantize the colors in all training images to reduce the number of features and to reduce the effect of stain variation between different images. Color information is encoded by a normalized histogram. Finally, we train a decision tree classifier to recognize superpixels belonging to glandular and non-glandular structures, and to assign the probability of a superpixel belonging to the glandular class. Results: We tested our algorithm on a benchmark dataset consisting of 72 images of Hematoxylin & Eosin (H&E) stained colon biopsies from 36 patients. The images were captured at 20× magnification and expert annotation is provided. One third of the images were used for training and the remaining for testing. Pixels with a probability value greater than 0.5 were considered as detected glands. Table 1 shows that, in terms of the Dice index, the proposed method performs 5% better than local binary patterns, and the combination of scattering SPD and color histogram is 25% more accurate than the baseline.

Table 1: Average segmentation performance

| Approach | Sensitivity | Specificity | Accuracy | Dice |
| --- | --- | --- | --- | --- |
| Farjam et al. (baseline) | 0.50 ± 0.13 | 0.80 ± 0.15 | 0.62 ± 0.09 | 0.59 ± 0.14 |
| Superpixels + local binary pattern | 0.77 ± 0.06 | 0.67 ± 0.10 | 0.73 ± 0.04 | 0.77 ± 0.05 |
| Superpixels + scattering SPD | 0.77 ± 0.07 | 0.85 ± 0.09 | 0.81 ± 0.06 | 0.82 ± 0.06 |
| Superpixels + color histogram | 0.74 ± 0.22 | 0.82 ± 0.17 | 0.77 ± 0.10 | 0.79 ± 0.10 |
| Superpixels + scattering SPD + color histogram | 0.78 ± 0.07 | 0.88 ± 0.07 | 0.82 ± 0.06 | 0.84 ± 0.06 |

Conclusions: We present a superpixel-based approach for glandular structure detection in colorectal cancer histology images. We also present a novel texture descriptor derived from the region covariance matrix of scattering coefficients. Our approach generates highly promising results for the initial detection of glandular structures in colorectal cancer histology images.
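The region-covariance idea at the heart of the scattering SPD descriptor can be shown in a few lines; the sketch below is illustrative and uses random numbers in place of scattering coefficients. For each superpixel, the per-pixel feature vectors are stacked and their covariance matrix, regularized to stay symmetric positive definite, summarizes how the features co-vary across the region.

```python
import numpy as np

def region_covariance(features, eps=1e-6):
    """Covariance descriptor of a region.

    features: (n_pixels, n_features) array of per-pixel feature vectors (in
    the paper these would be scattering-transform coefficients; here only the
    shape matters).  A small ridge keeps the matrix strictly positive
    definite, so it can later be compared with SPD-aware metrics.
    """
    cov = np.cov(features, rowvar=False)
    return cov + eps * np.eye(cov.shape[0])

rng = np.random.default_rng(0)
# Two hypothetical superpixels with 200 pixels and 5 features each.
glandular = rng.normal(0.0, 1.0, size=(200, 5))
stroma = rng.normal(0.0, 2.0, size=(200, 5))

for name, region in [("glandular-like", glandular), ("stroma-like", stroma)]:
    C = region_covariance(region)
    eigvals = np.linalg.eigvalsh(C)
    print(f"{name}: descriptor shape {C.shape}, eigenvalue range "
          f"[{eigvals.min():.2f}, {eigvals.max():.2f}]")
```

In the full pipeline each superpixel's descriptor (together with its color histogram) would feed the decision tree that assigns the glandular probability.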
City-wide Traffic Congestion Prediction In Road Networks
Authors: Iman Elghandour and Mohamed Khalefa
Traffic congestion is a major problem in many big cities around the world. According to a study performed by the World Bank in Egypt in 2010 and concluded in 2012, the cost of traffic congestion was estimated at 14 billion EGP in the Cairo metropolitan area and at 50 billion EGP (4% of GDP) for Egypt as a whole. Some of the reasons for the high monetary cost of traffic congestion are: (1) travel time delay, (2) travel time unreliability, and (3) excess fuel consumption. Smart traffic management addresses some of the causes and consequences of traffic congestion. It can predict congested routes, take preventive decisions to reduce congestion, disseminate information about accidents and work zones, and identify the alternative routes that can be taken. In this project, we develop a real-time and scalable data storage and analysis framework for traffic prediction and management. The input to this system is a stream of GPS and/or cellular data that has been cleaned and mapped to the road network. Our proposed framework provides (1) prediction of the roads that will suffer from traffic congestion in the near future, together with traffic management decisions that can relieve this congestion, and (2) a what-if traffic system that is used to simulate what will happen if a traffic management or planning decision is taken. For example, it answers questions such as: "What will happen if an additional ring road is built to surround Cairo?" or "What will happen if point of interest X is moved away from downtown to the outskirts of the city?" The framework has the following three characteristics. First, it predicts the flow of vehicles on the road based on historical data. This is done by tracking vehicles' everyday trajectories and using them in a statistical model to predict vehicle movement on the road. It then predicts the congested traffic zones based on the vehicles currently on the road and their predicted paths. Second, historical traffic data are heavily exploited in the approach we use to predict traffic flow and traffic congestion. Therefore, we develop new techniques to efficiently store traffic data in the form of graphs for fast retrieval. Third, it is required to update the traffic flow of vehicles and predict congested areas in real time; therefore, we deploy our framework in the cloud and employ optimization techniques to speed up the execution of our algorithms.
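A toy version of the first characteristic (predicting flow from historical trajectories) is sketched below; it is illustrative only, and the road graph, transition counts and capacity threshold are invented. Historical trajectories are turned into per-segment transition probabilities, current vehicle counts are propagated one step ahead, and segments whose expected load exceeds their capacity are flagged as likely congested.

```python
from collections import Counter, defaultdict

def transition_model(trajectories):
    """Estimate P(next segment | current segment) from historical trips."""
    counts = defaultdict(Counter)
    for trip in trajectories:
        for a, b in zip(trip, trip[1:]):
            counts[a][b] += 1
    return {seg: {nxt: c / sum(nbrs.values()) for nxt, c in nbrs.items()}
            for seg, nbrs in counts.items()}

def predict_congestion(current_positions, model, capacity):
    """Expected load per segment one step ahead, and the congested ones."""
    load = Counter()
    for seg, vehicles in current_positions.items():
        for nxt, p in model.get(seg, {}).items():
            load[nxt] += vehicles * p
    congested = [s for s, l in load.items() if l > capacity.get(s, float("inf"))]
    return load, congested

# Hypothetical historical trips over road segments A..D and current vehicle counts.
history = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"], ["B", "C", "D"]]
model = transition_model(history)
load, congested = predict_congestion({"A": 120, "B": 80}, model,
                                     capacity={"B": 100, "C": 60, "D": 50})
print("expected load:", dict(load))
print("likely congested:", congested)
```

The what-if capability amounts to rerunning the same propagation on a modified road graph or a modified set of points of interest, which is why the framework emphasizes graph-structured storage and cloud-scale execution.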
Integration Of Solar Generated Electricity Into Interconnected Microgrids Modeled As Partitioning Of Graphs With Supply And Demand In A Stochastic Environment
Authors: Raka Jovanovic and Abdelkader Bousselham
A significant research effort has been dedicated to developing smart grids in the form of interconnected microgrids. Their use is especially suitable for the integration of solar-generated electricity, because by separating the electrical grid into smaller subsections, the fluctuations in voltage and frequency that occur can be, to a certain extent, isolated from the main grid. For the new topology, it is essential to optimize several important properties such as self-adequacy, reliability, supply security and the potential for self-healing. These problems are frequently hard to solve, in the sense that they are hard combinatorial problems for which no polynomial-time algorithm exists that can find the desired optimal solutions. Because of this, research has been directed towards finding approximate solutions using different heuristic and metaheuristic methods. Another issue is that such systems are generally of a gigantic size. This has resulted in two types of models: detailed ones that are applied to small systems, and simplified ones for large systems. In the former case, graph models have been shown to be very suitable, especially ones based on graph partitioning problems [4]. One of the issues with the majority of previously developed graph models for large-scale systems is that they are deterministic: they are used to model an electrical grid which is, in essence, a stochastic system. In this work we focus on developing a stochastic graph model for including solar-generated electricity in a system of interconnected microgrids. More precisely, we focus on maximizing the self-adequacy of the individual microgrids, while trying to maximize the level of included solar-generated energy with a minimal amount of necessary energy storage. In our model we include the unpredictability of the generated electricity and, under such circumstances, maintain a high probability that all demands in the system are satisfied. In practice, we adapt and extend the concept of partitioning graphs with supply and demand to the problem of interest. This is done by having multiple values corresponding to the demand of one node in the graph; these values represent energy usage in different time periods of one day. In a similar fashion, we introduce a probability distribution for the amount of electrical energy that will be produced by the generating nodes, and the maximal amount of storage in such nodes. Finally, we also include a heuristic approach to optimize this multi-objective optimization problem.
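The sketch below illustrates, with invented numbers, what "partitioning a graph with supply and demand" means in this stochastic setting: each node carries a demand profile over a few time periods, a generator node carries per-period supply scenarios with probabilities, and for a candidate microgrid (a set of nodes plus a storage budget) the code estimates the probability that every period's demand is met. It is a brute-force stand-in for the heuristic described in the abstract.

```python
import itertools

# Per-node demand for three time periods of one day (illustrative numbers).
demand = {"n1": [3, 5, 4], "n2": [2, 2, 3], "n3": [4, 6, 5]}

# Generator node: list of (probability, per-period supply) scenarios.
solar_scenarios = {"g1": [(0.6, [6, 10, 7]),     # sunny day
                          (0.4, [3, 5, 3])]}     # cloudy day

def adequacy_probability(nodes, generators, storage_capacity):
    """P(all periods of a candidate microgrid are supplied), given storage.

    Enumerates every combination of generator scenarios, runs the periods in
    order while carrying over surplus energy (bounded by storage_capacity),
    and sums the probabilities of the combinations with no shortfall.
    """
    need = [sum(demand[n][t] for n in nodes) for t in range(3)]
    ok = 0.0
    for combo in itertools.product(*(solar_scenarios[g] for g in generators)):
        prob, supply = 1.0, [0.0] * 3
        for p, profile in combo:
            prob *= p
            supply = [s + x for s, x in zip(supply, profile)]
        stored, feasible = 0.0, True
        for t in range(3):
            balance = supply[t] + stored - need[t]
            if balance < 0:
                feasible = False
                break
            stored = min(storage_capacity, balance)
        if feasible:
            ok += prob
    return ok

print(adequacy_probability({"n1", "n2"}, ["g1"], storage_capacity=2.0))
```

A partitioning heuristic would search over assignments of nodes (and storage budgets) to microgrids so that every part keeps this adequacy probability high while maximizing the solar share, which is exactly the multi-objective trade-off stated above.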
Empower The Vle With Social Computing Tools: System Prototype
Authors: Khaled Hussein, Jassim Al-jaber and Yusuf Arayici
Some lecturers have reported and shown instances of moving part or all of their electronic course support from the Virtual Learning Environment (VLE) to social networking systems such as YouTube, MySpace and Facebook because of greater student engagement with these kinds of social networking tools. Recent student interviews at Aspire Academy in Qatar have revealed that students are not concerned with how they are taught (e.g. through lectures, seminars, distance learning sessions, or a blended learning approach) so long as the instruction is good. This opens great opportunities for delivering courses through social computing (SC) media on top of the VLE, but also raises the question: to what extent can the VLE and social media be leveraged as good practice in learning and teaching in different modalities? In this research, the new experience of enriching the VLE with SC tools at Aspire Academy is presented through the development of a new system prototype as a more effective solution. The prototyping process included usability testing with Aspire student-athletes and lecturers, plus heuristic evaluation by Human-Computer Interaction (HCI) experts. Implementing the prototype system in academic institutions is expected to lead to better learning and, consequently, better educational outcomes.
Gate Simulation Of A Clinical Pet Using The Computing Grid
Authors: Yassine Toufique and Othmane Bouhali
Nowadays, nuclear medicine is becoming a research field of growing importance at universities. This can be explained, on one hand, by the increasing number of medical imaging devices purchased by hospitals and, on the other hand, by the number of PhD students and researchers becoming interested in medical studies. Positron Emission Tomography (PET) is a functional medical imaging technique which provides 3D images of the living processes inside the body, relying on the use of radioisotopes. The physics of PET systems is based on the detection in coincidence of the two 511 keV γ-rays produced by an electron-positron annihilation and emitted in opposite directions, as dictated by the laws of conservation of energy and momentum. The radioactive nuclei used as sources of positron emission for PET systems are mainly 11C, 13N, 15O, and 18F, which are produced in cyclotrons and decay with half-lives of 20.3 min, 9.97 min, 124 s, and 110 min, respectively. These radioisotopes can be incorporated into a wide variety of radiopharmaceuticals that are inhaled or injected, leading to a medical diagnosis based on images obtained from a PET system. A PET scanner consists mainly of a large number of detector crystals arranged in a ring that surrounds the patient organ (or phantom, in simulations) where the radioisotope tracer (e.g. 18F-FDG) is inoculated. The final 3D image, representing the distribution of the radiotracer in the organ (or the phantom), is obtained by processing the signals delivered by the detectors of the scanner (when the γ-rays emitted from the source interact with the crystals) and using image reconstruction algorithms. This allows important body functions, such as blood flow, oxygen use, and glucose metabolism, to be measured, helping doctors evaluate how well organs and tissues are functioning and diagnose, determine the severity of, or treat a variety of diseases. The simulation of a real experiment using a GATE-modeled clinical PET scanner, namely the Philips Allegro, has been carried out using a computing Grid infrastructure. In order to reduce the computing time, the PET simulation tasks are split into several jobs submitted to the Grid to run simultaneously. The splitting technique and the merging of the outputs are discussed. Results of the simulation are presented and good agreement is observed with experimental data. Keywords: Grid computing; Monte Carlo simulation; GATE; Positron Emission Tomography; splitting
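The splitting strategy can be pictured with the short sketch below, which is a generic illustration rather than the authors' Grid scripts: a long simulation of N decay events is divided into independent jobs, each with its own event share and random seed so the jobs can run concurrently on different worker nodes, and the per-job outputs are merged afterwards by summing their coincidence counts.

```python
import random

def split_jobs(total_events, n_jobs, base_seed=42):
    """Divide a simulation of `total_events` decays into independent jobs."""
    share, remainder = divmod(total_events, n_jobs)
    return [{"job_id": i,
             "events": share + (1 if i < remainder else 0),
             "seed": base_seed + i}                 # distinct seed per job
            for i in range(n_jobs)]

def run_job(job, detection_prob=0.03):
    """Stand-in for one GATE job: count 'coincidences' among simulated decays.

    A real job would run the Monte Carlo transport of both 511 keV photons;
    here a Bernoulli draw per event just produces something to merge.
    """
    rng = random.Random(job["seed"])
    hits = sum(rng.random() < detection_prob for _ in range(job["events"]))
    return {"job_id": job["job_id"], "coincidences": hits}

def merge(outputs):
    """Merge per-job outputs into one result, as done after Grid execution."""
    return {"coincidences": sum(o["coincidences"] for o in outputs)}

jobs = split_jobs(total_events=1_000_000, n_jobs=8)
outputs = [run_job(j) for j in jobs]      # on the Grid these run simultaneously
print(merge(outputs))
```

Because the jobs are statistically independent, the wall-clock time scales roughly with 1/n_jobs, which is the speed-up the Grid deployment is meant to deliver.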
A Cross-platform Benchmark Framework For Mobile Semantic Web Reasoning Engines In Clinical Decision Support Systems
Authors: William Van Woensel, Newres Al Haider and Syed Sr Abidi
Background & Objectives: Semantic Web technologies are used extensively in the health domain to enable expressive, standards-based reasoning. Deploying Semantic Web reasoning processes directly on mobile devices has a number of advantages, including robustness to connectivity loss and more timely results. By leveraging local reasoning processes, Clinical Decision Support Systems (CDSS) can thus present timely alerts for dangerous health issues, even when connectivity is lacking. However, a number of challenges arise as well, related to mobile platform heterogeneity and limited computing resources. To tackle these challenges, developers should be empowered to benchmark mobile reasoning performance across different mobile platforms, with rule- and datasets of varying scale and complexity, and under typical CDSS reasoning process flows. To deal with the current heterogeneity of rule formats, a uniform interface on top of mobile reasoning engines also needs to be provided. System: We present a mobile, cross-platform benchmark framework comprising two main components: (1) a generic Semantic Web layer, supplying a uniform, standards-based rule- and dataset interface to mobile reasoning engines; and (2) a Benchmark Engine, to investigate mobile reasoning performance. This framework was implemented using the PhoneGap cross-platform development tool, allowing it to be deployed on a range of mobile platforms. During benchmark execution, the benchmark rule- and dataset (encoded using the SPARQL Inferencing Notation (SPIN) and the Resource Description Framework (RDF)) are first passed to the generic Semantic Web layer. In this layer, the local Proxy component contacts an external Conversion Web Service, where converters perform conversion into the different rule engine formats. Developers may develop new converters to support other engines. The results are then communicated back to the Proxy and passed on to the local Benchmark Engine. In the Benchmark Engine, reasoning can be conducted using different process flows, to better align the benchmarks with real-world CDSS. To plug in new reasoning engines (JavaScript or native), developers need to implement a plugin realizing a uniform interface (e.g., load data, execute rules). New process flows can also be supplied. In the benchmarks, data and rule loading times, as well as reasoning times, are measured. From our work in clinical decision support, we identified two useful reasoning process flows:
* Frequent Reasoning: to infer new facts, the reasoning engine is loaded with the entire datastore each time a certain timespan has elapsed, and the relevant ruleset is executed.
* Incremental Reasoning: the datastore is kept in memory, and reasoning is applied each time a new fact is added.
Currently, 4 reasoning engines (and their custom formats) are supported, including RDFQuery (https://code.google.com/p/rdfquery/wiki/RdfPlugin), RDFStore-JS (http://github.com/antoniogarrote/rdfstore-js), Nools (https://github.com/C2FO/nools) and AndroJena (http://code.google.com/p/androjena/). Conclusion: In this paper, we introduced a mobile, cross-platform and extensible benchmark framework for comparing mobile Semantic Web reasoning performance. Future work consists of investigating techniques to optimize mobile reasoning processes.
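The two process flows can be expressed abstractly as in the sketch below, which is a language-neutral illustration in Python rather than the PhoneGap/JavaScript framework itself: a reasoner object exposing load/add/execute operations is benchmarked either by reloading the full datastore at fixed intervals (frequent reasoning) or by keeping the store in memory and reasoning after every added fact (incremental reasoning), with loading and reasoning times recorded separately. The ToyReasoner and its rule format are invented.

```python
import time

class ToyReasoner:
    """Minimal stand-in for a mobile rule engine behind a uniform interface."""
    def __init__(self):
        self.facts = []

    def load(self, facts):
        self.facts = list(facts)

    def add(self, fact):
        self.facts.append(fact)

    def execute(self, rules):
        # Pretend inference: apply each rule (a callable) to every fact.
        return [r(f) for r in rules for f in self.facts if r(f) is not None]

def frequent_flow(reasoner, datastore, rules, rounds=3):
    """Reload the whole datastore each round, then reason."""
    timings = []
    for _ in range(rounds):
        t0 = time.perf_counter(); reasoner.load(datastore); t1 = time.perf_counter()
        reasoner.execute(rules);  t2 = time.perf_counter()
        timings.append({"load_s": t1 - t0, "reason_s": t2 - t1})
    return timings

def incremental_flow(reasoner, new_facts, rules):
    """Keep the store in memory and reason after every added fact."""
    timings = []
    for fact in new_facts:
        reasoner.add(fact)
        t0 = time.perf_counter(); reasoner.execute(rules); t1 = time.perf_counter()
        timings.append({"reason_s": t1 - t0})
    return timings

rules = [lambda f: ("alert", f) if f.get("heart_rate", 0) > 120 else None]
data = [{"heart_rate": 80}, {"heart_rate": 130}]
print(frequent_flow(ToyReasoner(), data, rules))
print(incremental_flow(ToyReasoner(), data, rules))
```

In the real framework the uniform interface is what lets RDFQuery, RDFStore-JS, Nools and AndroJena be swapped behind the same two flows and compared on identical rule- and datasets.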
Automatic Category Detection Of Islamic Content On The Internet Using Hyper Concept Keyword Extraction And Random Forest Classification
Authors: Abdelaali Hassaine and Ali Jaoua
The classification of Islamic content on the Internet is a very important step towards authenticity verification. Many Muslims complain that the information they get from the Internet is either inaccurate or simply wrong. With the content growing exponentially, its manual labeling and verification is simply an impossible task. To the best of our knowledge, no previous work has been carried out on this task. In this study, we propose a new method for the automatic classification of Islamic content on the Internet. A dataset has been created containing texts from four different Islamic groups, namely: Sunni (content representing Sunni Islam), Shia (content representing Shia Islam), Madkhali (content forbidding politics and warning against all scholars with different views) and Jihadi (content promoting Jihad). We collected 20 different texts for each of these groups, totaling 80 texts, of which 56 are used for training and 24 for testing. In order to classify these contents automatically, we first preprocessed the texts using normalization, stop-word removal, stemming and segmentation into words. Then, we used the hyper-concept method, which makes it possible to represent any corpus as a relation, decompose it into non-overlapping rectangular relations, and highlight the most representative attributes or keywords in a hierarchical way. The hyper-concept keywords extracted from the training set are subsequently used as predictors (containing 1 when the text contains the keyword and 0 otherwise). These predictors are fed to a random forest classifier of 5000 random trees. The number of extracted keywords varies according to the depth of the hyper-concept tree, ranging from 47 keywords (depth 1) to 296 keywords (depth 15). The average classification accuracy starts at 45.79% for depth 1 and remains roughly stable at 68.33% from depth 10 onwards. This result is very interesting as there are four different classes (a random predictor would therefore score around 25%). This study is a significant step towards the automatic classification of Islamic content on the Internet. The results show that the hyper-concept method successfully extracts relevant keywords for each group and helps in categorizing them automatically. The method needs to be combined with a semantic method in order to reach even higher classification rates. The results of the method are also to be compared with manual classification in order to foresee the improvement one can expect, as some texts might belong to more than one category.
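The classification stage maps naturally onto standard tooling; the sketch below is a hypothetical scikit-learn fragment, not the authors' pipeline, and uses a made-up keyword list and toy documents in place of the hyper-concept keywords and the actual corpus. Each document becomes a binary vector (1 if it contains a keyword, 0 otherwise) and a random forest is trained on those predictors.

```python
from sklearn.ensemble import RandomForestClassifier

# Placeholder keyword list standing in for the extracted hyper-concept keywords.
KEYWORDS = ["keyword_a", "keyword_b", "keyword_c", "keyword_d"]

def binary_features(doc_tokens):
    """1/0 predictor per keyword: does the document contain it?"""
    return [1 if kw in doc_tokens else 0 for kw in KEYWORDS]

# Toy training documents (already preprocessed into token sets) and labels.
train_docs = [{"keyword_a", "keyword_b"}, {"keyword_c"},
              {"keyword_a", "keyword_d"}, {"keyword_b", "keyword_c"}]
train_labels = ["Sunni", "Shia", "Madkhali", "Jihadi"]

X_train = [binary_features(d) for d in train_docs]
clf = RandomForestClassifier(n_estimators=5000, random_state=0)  # 5000 trees, as in the abstract
clf.fit(X_train, train_labels)

test_doc = {"keyword_a", "keyword_b", "something_else"}
print(clf.predict([binary_features(test_doc)]))   # predicted group label
```

Varying the hyper-concept tree depth changes only the length of the keyword list (47 to 296 predictors here), which is how the accuracy-versus-depth curve reported above is obtained.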
Optimal Communication For Sources And Channels With Memory And Delay-sensitive Applications
Shannon's theory of information was developed to address the fundamental problems of communication, such as reliable data transmission over a noisy channel and optimal data compression. Over the years, it has expanded to find a wide range of applications in many areas, ranging from cryptography and cyber security to economics and genetics. Recent technological advances designate information theory as a promising and elegant tool to analyze and model information structures within living organisms. The key characteristics of data transmission within organisms are that it involves sources and channels with memory and feedback, that organisms handle their information in a fascinating Shannon-optimal way, and that the transmission of the data is delayless. Despite the extensive literature on memoryless sources and channels, the literature regarding sources and channels with memory is limited. Moreover, the optimality of communication schemes for these general sources and channels is largely unexplored. Optimality is often addressed via Joint Source Channel Coding (JSCC), and it is achieved if there exists an encoder-decoder scheme such that the Rate Distortion Function (RDF) of the source is equal to the capacity of the channel. This work is motivated by neurobiological data transmission and aims to design and analyze optimal communication systems consisting of channels and sources with memory within a delay-sensitive environment. To this end, we calculate the capacity of the given channel with memory and match it to a Markovian source via an encoder-decoder scheme, utilizing concepts from information theory and stochastic control theory. The most striking result to emerge from this research is that optimal and delayless communication for sources and channels with memory is not only feasible, but is also achieved with minimum complexity and computational cost. Although the current research is stimulated by a neurobiological application, the proposed approach and methodology, as well as the provided results, deliver several noteworthy contributions to a plethora of applications. These include, among others, delay-sensitive and real-time communication systems, control-communication applications and sensor networks. It addresses issues such as causality, power efficiency, complexity and security, extends the current knowledge of channels and sources with memory, and contributes to the inconclusive debates on real-time communication and uncoded data transmission.
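The JSCC optimality condition referred to above can be written compactly; the display below is the standard textbook statement (for one channel use per source symbol), added for illustration rather than taken from the paper.

```latex
% Source-channel matching: an encoder-decoder pair achieving distortion D is
% optimal when the source rate-distortion function meets the channel capacity.
\begin{align}
  C &= \max_{p(x)} I(X;Y), &
  R(D) &= \min_{p(\hat{s}\mid s):\ \mathbb{E}[d(S,\hat{S})]\le D} I(S;\hat{S}), &
  R(D_{\min}) &= C .
\end{align}
```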