Child Monitoring System App

Android Mobile development
Child Monitoring System App A solution for locating missing children with the help of GPS technology. The application relies on two main services: GPS for location, and telephony for SMS, call logs and contacts. The Internet is used for communication between the child’s and the parent’s devices. The system is simple in design, and the application is built to be user-friendly on both sides. Both parents and children need GPS-enabled smartphones. The application tracks the child’s location as well as the call logs, messages and contacts on the child’s smartphone. Android was chosen because it is widely used globally, which lets the application reach more users.

Railway Ticket Booking System Using Qr Code

Android Mobile development, Web | Desktop Application
Railway Ticket Booking System Using Qr Code This project deals with the development and implementation of a smartphone application that is simpler and more effective than the current ticketing system. With the “Ticket Checker System”, a ticket can be bought easily anytime, anywhere, and is stored on the customer’s phone in the form of a “Quick Response Code”. GPS is used to validate the ticket at the source and delete it at the destination. The information for each user is stored in an SQL database for security purposes, something that is unavailable in the current suburban railway system. The ticket checker is also provided with an application that looks up a user’s ticket by ticket number in the cloud database for checking purposes.
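The GPS validation step could look roughly like the sketch below. It assumes that "validation at the source" means checking that the phone's GPS fix lies within some radius of the source station; the function names and the 200 m radius are hypothetical and are not taken from the project.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_at_station(phone_pos, station_pos, radius_m=200.0):
    """True when the phone's GPS fix is within radius_m of the station.

    radius_m is an assumed threshold chosen for illustration only.
    """
    return haversine_m(*phone_pos, *station_pos) <= radius_m
```

The same check could be reused at the destination to decide when a ticket is removed, again under the assumption that proximity to the station is the trigger.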

Mobile Based Attendance System

Android Mobile development, Web | Desktop Application
Mobile Based Attendance System built to eliminate the time and effort wasted in taking attendance in schools and colleges. It also greatly reduces the amount of paper needed for attendance data management. This is an Android mobile app built for school and college faculty so that they can take student attendance on their phones. The system is divided into the following modules: Student Attendance List Creation: Once the app is installed on a phone, it allows the user to create a student attendance sheet consisting of name, roll number, date, Absent/Present mark and subject. The user has to fill in student names along with the associated roll numbers. Attendance Marking: The faculty member now has the list on the phone and can view the list, call roll numbers and select absent if the student…

Optimization of Horizontal Aggregation in SQL by Using K-Means Clustering.

Cloud Computing, Data mining
Optimization of Horizontal Aggregation in SQL by Using K-Means Clustering. To analyze data efficiently, data mining systems widely use datasets with columns in a horizontal tabular layout. Preparing a data set is one of the more complex tasks in a data mining project, requiring many SQL queries, table joins and column aggregations. Conventional RDBMSs usually manage tables in vertical form. Horizontal aggregation returns a set of numbers per row in a horizontal tabular layout, instead of one number per row. The system uses one parent table and several child tables; operations are then performed on the data loaded from multiple tables. The PIVOT operator offered by the RDBMS is used to calculate the aggregate operations. The PIVOT method is much faster and scales well. Partitioning the large data set obtained from the result of horizontal aggregation into…
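To illustrate what a horizontal aggregation produces, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no PIVOT operator, so the sketch uses the equivalent SUM(CASE ...) formulation instead; the `sales` table, its columns and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, quarter TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "Q1", 10.0), ("A", "Q2", 20.0), ("A", "Q1", 5.0),
     ("B", "Q1", 7.0), ("B", "Q2", 3.0)],
)


def horizontal_sum(conn, quarters):
    """Return one row per store with one aggregated column per quarter."""
    # One SUM(CASE ...) column per requested quarter; values are bound as parameters.
    cols = ", ".join(
        f"SUM(CASE WHEN quarter = ? THEN amount ELSE 0 END) AS sum_{i}"
        for i, _ in enumerate(quarters)
    )
    sql = f"SELECT store, {cols} FROM sales GROUP BY store ORDER BY store"
    return conn.execute(sql, list(quarters)).fetchall()


rows = horizontal_sum(conn, ["Q1", "Q2"])
for row in rows:
    print(row)
```

A vertical aggregation (GROUP BY store, quarter) would return one number per row; the horizontal form returns a set of numbers per store, which is the layout most data mining tools expect as input.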

Interpreting the Public Sentiment Variations on Twitter

Data mining
Interpreting the Public Sentiment Variations on Twitter Millions of users share their opinions on Twitter, making it a valuable platform for tracking and analyzing public sentiment. Such tracking and analysis can provide critical information for decision making in various domains. Therefore it has attracted attention in both academia and industry. Previous research mainly focused on modeling and tracking public sentiment. In this work, we move one step further to interpret sentiment variations. We observed that emerging topics (named foreground topics) within the sentiment variation periods are highly related to the genuine reasons behind the variations. Based on this observation, we propose a Latent Dirichlet Allocation (LDA) based model, Foreground and Background LDA (FB-LDA), to distill foreground topics and filter out long-standing background topics. These foreground topics can give potential interpretations of…

Product Aspect Ranking and Its Applications

Data mining
Product Aspect Ranking and Its Applications Numerous consumer reviews of products are now available on the Internet. Consumer reviews contain rich and valuable knowledge for both firms and users. However, the reviews are often disorganized, leading to difficulties in information navigation and knowledge acquisition. This article proposes a product aspect ranking framework, which automatically identifies the important aspects of products from online consumer reviews, aiming at improving the usability of the numerous reviews. The important product aspects are identified based on two observations: 1) the important aspects are usually commented on by a large number of consumers and 2) consumer opinions on the important aspects greatly influence their overall opinions on the product. In particular, given the consumer reviews of a product, we first identify product aspects by a shallow dependency parser…

Supporting Privacy Protection in Personalized Web Search

Data mining, Web | Desktop Application
Supporting Privacy Protection in Personalized Web Search Personalized web search (PWS) has demonstrated its effectiveness in improving the quality of various search services on the Internet. However, evidence shows that users’ reluctance to disclose their private information during search has become a major barrier to the wide proliferation of PWS. We study privacy protection in PWS applications that model user preferences as hierarchical user profiles. We propose a PWS framework called UPS that can adaptively generalize profiles by queries while respecting user-specified privacy requirements. Our runtime generalization aims at striking a balance between two predictive metrics that evaluate the utility of personalization and the privacy risk of exposing the generalized profile. We present two greedy algorithms, namely GreedyDP and GreedyIL, for runtime generalization. We also provide an online prediction…

Set Predicates in SQL: Enabling Set- Level Comparisons for Dynamically Formed Groups

Data mining, Web | Desktop Application
Set Predicates in SQL: Enabling Set- Level Comparisons for Dynamically Formed Groups In data warehousing and OLAP applications, scalar level predicates in SQL become increasingly inadequate to support a class of operations that require set-level comparison semantics, i.e., comparing a group of tuples with multiple values. Currently, complex SQL queries composed by scalar-level operations are often formed to obtain even very simple set-level semantics. Such queries are not only difficult to write but also challenging for a database engine to optimize, thus can result in costly evaluation. This paper proposes to augment SQL with set predicate, to bring out otherwise obscured set-level semantics. We studied two approaches to processing set predicates—an aggregate function-based approach and a bitmap index-based approach. Moreover, we designed a histogram-based probabilistic method of set predicate selectivity…

RRW—A Robust and Reversible Watermarking Technique for Relational Data

Data mining, Web | Desktop Application
RRW—A Robust and Reversible Watermarking Technique for Relational Data Advancement in information technology is playing an increasing role in the use of information systems comprising relational databases. These databases are used effectively in collaborative environments for information extraction; consequently, they are vulnerable to security threats concerning ownership rights and data tampering. Watermarking is advocated to enforce ownership rights over shared relational data and to provide a means for tackling data tampering. When ownership rights are enforced using watermarking, the underlying data undergoes certain modifications, as a result of which the data quality gets compromised. Reversible watermarking is employed to ensure data quality along with data recovery. However, such techniques are usually not robust against malicious attacks and do not provide any mechanism to selectively watermark a particular attribute by taking into account its role in knowledge discovery. Therefore, reversible watermarking is required that ensures: (i) watermark encoding and decoding by…

Access Control Mechanisms for Outsourced Data in Cloud

Cloud Computing, Web | Desktop Application
Access Control Mechanisms for Outsourced Data in Cloud Traditional access control models often assume that the entity enforcing access control policies is also the owner of data and resources. This assumption no longer holds when data is outsourced to a third-party storage provider, such as the cloud. Existing access control solutions mainly focus on preserving confidentiality of stored data from unauthorized access and the storage provider. However, in this setting, access control policies as well as users' access patterns also become privacy sensitive information that should be protected from the cloud. We propose a two-level access control scheme that combines coarse-grained access control enforced at the cloud, which allows to get acceptable communication overhead and at the same time limits the information that the cloud learns…

Balancing Performance, Accuracy, and Precision for Secure Cloud Transactions

Cloud Computing, Web | Desktop Application
Balancing Performance, Accuracy, and Precision for Secure Cloud Transactions In distributed transactional database systems deployed over cloud servers, entities cooperate to form proofs of authorizations that are justified by collections of certified credentials. These proofs and credentials may be evaluated and collected over extended time periods under the risk of having the underlying authorization policies or the user credentials being in inconsistent states. It therefore becomes possible for policy-based authorization systems to make unsafe decisions that might threaten sensitive resources. In this paper, we highlight the criticality of the problem. We then define the notion of trusted transactions when dealing with proofs of authorization. Accordingly, we propose several increasingly stringent levels of policy consistency constraints, and present different enforcement approaches to guarantee the trustworthiness of transactions executing on cloud servers. We propose a Two-Phase Validation Commit protocol…

Typicality-Based Collaborative Filtering Recommendation

Cloud Computing, Data mining, Security and Encryption
Typicality-Based Collaborative Filtering Recommendation Collaborative filtering (CF) is an important and popular technology for recommender systems. However, current CF methods suffer from such problems as data sparsity, recommendation inaccuracy, and big-error in predictions. In this paper, we borrow ideas of object typicality from cognitive psychology and propose a novel typicality-based collaborative filtering recommendation method named TyCo. A distinct feature of typicality-based CF is that it finds “neighbors” of users based on user typicality degrees in user groups (instead of the corated items of users, or common users of items, as in traditional CF). To the best of our knowledge, there has been no prior work on investigating CF recommendation by combining object typicality. TyCo outperforms many CF recommendation methods on recommendation accuracy (in terms of MAE) with an improvement of…

A Location- and Diversity-aware News Feed System for Mobile Users

Android Mobile development, Security and Encryption
A Location- and Diversity-aware News Feed System for Mobile Users A location-aware news feed system enables mobile users to share geo-tagged user-generated messages, e.g., a user can receive nearby messages that are the most relevant to her. In this paper, we present MobiFeed, a framework designed for scheduling news feeds for mobile users. MobiFeed consists of three key functions: location prediction, relevance measure, and news feed scheduler. The location prediction function is designed to predict a mobile user’s locations based on an existing path prediction algorithm. The relevance measure function is implemented by combining the vector space model with non-spatial and spatial factors to determine the relevance of a message to a user. The news feed scheduler works with the other two functions to generate news feeds for…
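One way to picture a relevance measure that combines the vector space model with a spatial factor is the sketch below. It is not MobiFeed's actual formula: the bag-of-words cosine similarity, the linear distance decay, the 1000 m range and the weight `alpha` are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts under a simple bag-of-words vector space model."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def spatial_score(distance_m, max_range_m=1000.0):
    """Linear decay from 1 at distance 0 to 0 at max_range_m (assumed range)."""
    return max(0.0, 1.0 - distance_m / max_range_m)


def relevance(user_profile, message_text, distance_m, alpha=0.5):
    """Weighted mix of content similarity and spatial proximity; alpha is a hypothetical weight."""
    content = cosine_similarity(user_profile, message_text)
    return alpha * content + (1 - alpha) * spatial_score(distance_m)
```

A scheduler could then rank candidate messages by this score at the user's predicted location; how the real system weights the factors is not stated in the abstract.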

Design of a Secured E-voting System

Android Mobile development
Design of a Secured E-voting System E-voting systems are becoming popular with the widespread use of computers and embedded systems. Security is a vital issue that should be considered in such systems. This paper proposes a new e-voting system that fulfills the security requirements of e-voting. It is based on the homomorphic property and a blind signature scheme. The proposed system is implemented on an embedded system which serves as a voting machine. The system employs RFID to store all conditions that comply with government rules for checking voter eligibility.

Shopping Application System With Near Field Communication (NFC) Based on Android

Android Mobile development, Web | Desktop Application
Shopping Application System With Near Field Communication (NFC) Based on Android The rapid development of mobile communication systems, along with changes in hardware, operating systems and Internet bandwidth, has led mobile applications to exploit these developments. Mobile commerce applications, for example, have become some of the most popular applications among mobile users who do not want the trouble of carrying cash everywhere. An important technology behind mobile payments is Near Field Communication (NFC). As an indication of the business potential of NFC, leading companies such as Nokia, Microsoft and NXP Semiconductors are actively engaged in the NFC Forum. The shopping application process is integrated with NFC technology based on Android. Shopping application system designed, for the…

Developing an Android based Learning Application for Mobile Devices

Android Mobile development
Developing an Android based Learning Application for Mobile Devices This paper is about the development of MLEA, a platform that assists, through Android cellphones and tablets, the mobility of users of virtual learning environments. MLEA is an application that implements computational techniques such as web services, design patterns, ontologies, and mobile computational techniques in order to allow communication between mobile devices and the content management system Moodle. It is based on a service-oriented, client-server architecture that combines REST and the JSON format for data interchange. The client will be provided with features for alerts, file downloads, chats and forums, grade books, quizzes, and calendar, among others.

Location Based Reminder Using GPS For Mobile

Android Mobile development
Location Based Reminder Using GPS For Mobile Although location-based reminder applications have been widely prototyped, there are few results regarding their impact on people: how they are used, whether they change people’s behavior, and which features influence usefulness the most. Cell phones provide a compelling platform for the delivery of location-based reminders within a user's everyday natural context. We present requirements for location-based reminders resulting from a qualitative study performed in a small area of the city, and elaborate on how these results are influencing the ongoing design of a more comprehensive location-based reminder system. In this paper we propose an architecture for location-based services that uses GPS. Within the architecture, we discuss the challenges of context management, service trigger mechanisms and preference-based services.

OCRAndroid: A Framework to Digitize Text Using Mobile Phones

Android Mobile development
OCRAndroid: A Framework to Digitize Text Using Mobile Phones As demand grows for mobile phone applications, research in optical character recognition, a technology well developed for scanned documents, is shifting focus to the recognition of text embedded in digital photographs. In this paper, we present OCRdroid, a generic framework for developing OCR-based applications on mobile phones. OCRdroid combines a lightweight image preprocessing suite installed inside the mobile phone and an OCR engine connected to a backend server. We demonstrate the power and functionality of this framework by implementing two applications called PocketPal and PocketReader based on OCRdroid on the HTC Android G1 mobile phone. Initial evaluations of these pilot experiments demonstrate the potential of using the OCRdroid framework for real-world OCR-based mobile applications.

Android based elimination of potholes

Android Mobile development, Web | Desktop Application
Android based elimination of potholes This is a web-based project where local residents can report potholes in their neighbourhood. They can take a picture of a pothole and upload it to submit a complaint to the BMC department. Every user has their own login credentials to view the reported potholes. The BMC has its own admin login to review the postings and to tackle or reply to each complaint, so that the problems can be resolved as soon as possible.

Secure Authentication using Image Processing and Visual Cryptography for Banking Applications

Image Processing, Web | Desktop Application
Secure Authentication using Image Processing and Visual Cryptography for Banking Applications Core banking is a set of services provided by a group of networked bank branches. Bank customers may access their funds and perform other simple transactions from any of the member branch offices. The major issue in core banking is the authenticity of the customer. Due to unavoidable hacking of databases on the internet, it is always quite difficult to trust information on the internet. To solve this problem of authentication, we propose an algorithm based on image processing and visual cryptography. This paper proposes a technique of processing the signature of a customer and then dividing it into shares. The total number of shares to be created depends on the scheme chosen by the bank.…

An Algorithm to Automatically Generate Schedule for School Lectures Using a Heuristic Approach

Web | Desktop Application
An Algorithm to Automatically Generate Schedule for School Lectures Using a Heuristic Approach This paper proposes a general solution for the school timetabling problem. Most heuristics proposed earlier approach the problem from the students’ point of view. This solution, however, works from the teachers’ point of view, i.e. teacher availability for a given time slot. While all the hard constraints (e.g. the availability of teachers, etc.) are resolved rigorously, the scheduling solution presented in this paper is an adaptive one, with a primary aim of solving the issue of clashes of lectures and subjects pertaining to teachers.
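A teacher-centred heuristic of this kind can be sketched as a simple greedy assignment. This is an illustrative stand-in, not the paper's algorithm: the data layout, the "least available teacher first" ordering and the function name `build_timetable` are assumptions.

```python
def build_timetable(lectures, availability, slots):
    """Greedily assign each (class_name, teacher) lecture to a free slot.

    lectures:     list of (class_name, teacher) pairs to schedule
    availability: dict mapping teacher -> set of slots the teacher can teach in
    slots:        ordered list of all time slots
    Returns a list of (class_name, teacher, slot) triples.
    """
    teacher_busy = set()  # (teacher, slot) pairs already taken
    class_busy = set()    # (class_name, slot) pairs already taken
    timetable = []
    # Hard constraint first: teachers with the fewest available slots are placed first.
    for class_name, teacher in sorted(lectures, key=lambda lec: len(availability[lec[1]])):
        for slot in slots:
            if (slot in availability[teacher]
                    and (teacher, slot) not in teacher_busy
                    and (class_name, slot) not in class_busy):
                timetable.append((class_name, teacher, slot))
                teacher_busy.add((teacher, slot))
                class_busy.add((class_name, slot))
                break
        else:
            raise ValueError(f"no clash-free slot for {teacher} with class {class_name}")
    return timetable


slots = ["Mon-1", "Mon-2"]
availability = {"Rao": {"Mon-1"}, "Iyer": {"Mon-1", "Mon-2"}}
result = build_timetable([("10A", "Iyer"), ("10A", "Rao")], availability, slots)
print(result)
```

Because the greedy pass never backtracks, it can fail on inputs that do have a valid timetable; an adaptive approach such as the one the abstract describes would need some repair or reordering step on top of this.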

A Mixed Reality Virtual Clothes Try-On System

Web | Desktop Application
A Mixed Reality Virtual Clothes Try-On System Virtual try-on of clothes has received much attention recently due to its commercial potential. It can be used for online shopping or intelligent recommendation to narrow down the selections to a few designs and sizes. In this paper, we present a mixed reality system for 3D virtual clothes try-on that enables a user to see herself wearing virtual clothes while looking at a mirror display, without taking off her actual clothes. The user can select various virtual clothes to try on. The system physically simulates the selected virtual clothes on the user's body in real time, and the user can see the virtual clothes fitting on her mirror image from various angles as she moves. The major contribution of this paper is that we automatically…

A Query Formulation Language for the data web

Data mining
A Query Formulation Language for the data web We present a query formulation language called MashQL in order to easily query and fuse structured data on the web. The main novelty of MashQL is that it allows people with limited IT-skills to explore and query one or multiple data sources without prior knowledge about the schema, structure, vocabulary, or any technical details of these sources. More importantly, to be robust and cover most cases in practice, we do not assume that a data source should have -an offline or inline- schema. This poses several language-design and performance complexities that we fundamentally tackle. To illustrate the query formulation power of MashQL, and without loss of generality, we chose the Data Web scenario. We also chose querying RDF, as it is the…

Efficient and Discovery of Patterns in Sequence Data Sets.

Data mining, Web | Desktop Application
Efficient and Discovery of Patterns in Sequence Data Sets. Existing sequence mining algorithms mostly focus on mining for subsequences. However, a large class of applications, such as biological DNA and protein motif mining, require efficient mining of “approximate” patterns that are contiguous. The few existing algorithms that can be applied to find such contiguous approximate pattern mining have drawbacks like poor scalability, lack of guarantees in finding the pattern, and difficulty in adapting to other applications. In this paper, we present a new algorithm called Flexible and Accurate Motif DEtector (FLAME). FLAME is a flexible suffix-tree-based algorithm that can be used to find frequent patterns with a variety of definitions of motif (pattern) models. It is also accurate, as it always finds the pattern if it exists. Using both real…

Mining Web Graphs for Recommendations.

Data mining, Web | Desktop Application
Mining Web Graphs for Recommendations. With the exponential explosion of various contents generated on the Web, recommendation techniques have become increasingly indispensable. Innumerable different kinds of recommendations are made on the Web every day, including music, image and book recommendations, query suggestions, etc. No matter what types of data sources are used for the recommendations, essentially these data sources can be modeled in the form of graphs. In this paper, aiming at providing a general framework on mining Web graphs for recommendations, (1) we first propose a novel diffusion method which propagates similarities between different recommendations; (2) then we illustrate how to generalize different recommendation problems into our graph diffusion framework. The proposed framework can be utilized in many recommendation tasks on the World Wide Web, including query suggestions, image recommendations,…

Improving Aggregate Recommendation Diversity Using Ranking-Based Techniques

Data mining, Web | Desktop Application
Improving Aggregate Recommendation Diversity Using Ranking-Based Techniques Recommender systems are becoming increasingly important to individual users and businesses for providing personalized recommendations. However, while the majority of algorithms proposed in recommender systems literature have focused on improving recommendation accuracy, other important aspects of recommendation quality, such as the diversity of recommendations, have often been overlooked. In this paper, we introduce and explore a number of item ranking techniques that can generate recommendations that have substantially higher aggregate diversity across all users while maintaining comparable levels of recommendation accuracy. Comprehensive empirical evaluation consistently shows the diversity gains of the proposed techniques using several real-world rating datasets and different rating prediction algorithms.

Predicting missing items in shopping cart using fast algorithm

Data mining, Web | Desktop Application
Predicting missing items in shopping cart using fast algorithm Prediction in a shopping cart uses partial information about the contents of the cart to predict what else the customer is likely to buy. In order to reduce the rule mining cost, a fast algorithm that generates frequent itemsets without generating candidate itemsets is proposed. The algorithm uses Boolean vectors with the relational AND operation to discover frequent itemsets and generate the association rules. Association rules are used to identify relationships among a set of items in a database. Initially a Boolean matrix is generated by transforming the database into Boolean values. The frequent itemsets are generated from the Boolean matrix. Then association rules are generated from the already generated frequent itemsets. The association rules generated form the basis for prediction. The…
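The Boolean-matrix idea can be illustrated with a small sketch. Each item gets one Boolean column packed into a Python integer, and the support of an itemset is the number of set bits after AND-ing its columns. This is a simplified illustration of the AND-based support counting only; the depth-first enumeration and the function name are assumptions, not the paper's exact procedure.

```python
def frequent_itemsets(transactions, min_support):
    """Find itemsets that occur in at least min_support transactions.

    transactions: list of sets of items
    Returns a dict mapping itemset (sorted tuple) -> support count.
    """
    items = sorted({i for t in transactions for i in t})
    # Boolean column per item, packed into an int: bit k is set when transaction k has the item.
    column = {i: sum(1 << k for k, t in enumerate(transactions) if i in t) for i in items}

    def support(mask):
        return bin(mask).count("1")

    result = {}

    def extend(prefix, mask, start):
        # Grow the current itemset one item at a time, AND-ing in that item's column.
        for idx in range(start, len(items)):
            new_mask = mask & column[items[idx]]
            count = support(new_mask)
            if count >= min_support:
                itemset = prefix + (items[idx],)
                result[itemset] = count
                extend(itemset, new_mask, idx + 1)

    all_rows = (1 << len(transactions)) - 1  # every transaction selected
    extend((), all_rows, 0)
    return result


carts = [{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}]
found = frequent_itemsets(carts, min_support=2)
print(found)
```

From such support counts, the confidence of a rule like bread → milk is support(bread, milk) / support(bread), which is the quantity a cart predictor would rank on.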

A Threshold-based Similarity Measure for Duplicate Detection

Data mining, Web | Desktop Application
A Threshold-based Similarity Measure for Duplicate Detection In order to extract beneficial information and recognize a particular pattern from huge data stored in different databases with different formats, data integration is essential. However, the problem that arises here is that data integration may lead to duplication. In other words, due to the availability of data in different formats, there might be some records which refer to the same entity. Duplicate detection, or record linkage, is a technique used to detect and match duplicate records generated in the data integration process. Most approaches have concentrated on string similarity measures for comparing records. However, they fail to identify records which share the same semantic information. So, in this study, a threshold-based method which takes into account both string and semantic similarity…
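A threshold test over both kinds of similarity might be sketched as follows. The abstract is truncated before it says how the two measures are combined, so this sketch assumes the larger of the two scores is compared against the threshold; the synonym table used for the "semantic" score and the 0.8 threshold are likewise hypothetical.

```python
from difflib import SequenceMatcher


def string_similarity(a, b):
    """Character-level similarity in [0, 1] from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def semantic_similarity(a, b, synonyms):
    """Jaccard overlap of tokens after mapping each token to a canonical form.

    synonyms: dict mapping a token to its canonical equivalent (hypothetical lookup table).
    """
    tokens_a = {synonyms.get(t, t) for t in a.lower().split()}
    tokens_b = {synonyms.get(t, t) for t in b.lower().split()}
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_duplicate(a, b, synonyms, threshold=0.8):
    """Flag two records as duplicates when either similarity reaches the assumed threshold."""
    score = max(string_similarity(a, b), semantic_similarity(a, b, synonyms))
    return score >= threshold


synonyms = {"st": "street", "rd": "road"}
print(is_duplicate("12 Baker St", "12 Baker Street", synonyms))
```

In practice the semantic score would come from a richer resource than a lookup table, but the thresholding step would look the same.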

Efficient Multi-dimensional Fuzzy Search for Personal Information Management Systems

Data mining, Web | Desktop Application
Efficient Multi-dimensional Fuzzy Search for Personal Information Management Systems With the explosion in the amount of semi-structured data users access and store in personal information management systems, there is a critical need for powerful search tools to retrieve often very heterogeneous data in a simple and efficient way. Existing tools typically support some IR-style ranking on the textual part of the query, but only consider structure (e.g., file directory) and metadata (e.g., date, file type) as filtering conditions. We propose a novel multi-dimensional search approach that allows users to perform fuzzy searches for structure and metadata conditions in addition to keyword conditions. Our techniques individually score each dimension and integrate the three dimension scores into a meaningful unified score. We also design indexes and algorithms to efficiently identify the most…

Enabling Multilevel Trust in Privacy Preserving Data Mining

Data mining, Web | Desktop Application
Enabling Multilevel Trust in Privacy Preserving Data Mining Privacy Preserving Data Mining (PPDM) addresses the problem of developing accurate models about aggregated data without access to precise information in individual data records. A widely studied perturbation-based PPDM approach introduces random perturbation to individual values to preserve privacy before data are published. Previous solutions of this approach are limited in their tacit assumption of single-level trust on data miners. In this work, we relax this assumption and expand the scope of perturbation-based PPDM to Multilevel Trust (MLT-PPDM). In our setting, the more trusted a data miner is, the less perturbed the copy of the data it can access. Under this setting, a malicious data miner may have access to differently perturbed copies of the same data through various means, and may combine…

Advance Mining of Temporal High Utility Itemset

Data mining, Web | Desktop Application
Advance Mining of Temporal High Utility Itemset The stock market domain is a dynamic and unpredictable environment. Traditional techniques, such as fundamental and technical analysis can provide investors with some tools for managing their stocks and predicting their prices. However, these techniques cannot discover all the possible relations between stocks and thus there is a need for a different approach that will provide a deeper kind of analysis. Data mining can be used extensively in the financial markets and help in stock-price forecasting. Therefore, we propose in this paper a portfolio management solution with business intelligence characteristics. We know that the temporal high utility itemsets are the itemsets with support larger than a pre-specified threshold in current time window of data stream. Discovery of temporal high utility itemsets is an…

A Framework for Personal Mobile Commerce Pattern Mining and Prediction

Data mining, Web | Desktop Application
A Framework for Personal Mobile Commerce Pattern Mining and Prediction In many applications, including location based services, queries may not be precise. In this paper, we study the problem of efficiently computing range aggregates in a multidimensional space when the query location is uncertain. Specifically, for a query point Q whose location is uncertain and a set S of points in a multidimensional space, we want to calculate the aggregate (e.g., count, average and sum) over the subset S′ of S such that for each p ∈ S′, Q has at least probability θ within the distance γ to p. We propose novel, efficient techniques to solve the problem following the filtering-and-verification paradigm. In particular, two novel filtering techniques are proposed to effectively and efficiently remove data points from…

Investigation and Analysis of New Approach of Intelligent Semantic Web Search Engines

Data mining, Web | Desktop Application
Investigation and Analysis of New Approach of Intelligent Semantic Web Search Engines The WWW allows people to share huge amounts of information from large database repositories, and the amount of information keeps growing across billions of databases. To search for particular information in these huge databases, we need specialized mechanisms that help retrieve it efficiently. Nowadays various types of search engines are available, which makes information retrieval difficult. Semantic web search engines play a vital role in providing a better solution to this problem. The main aim of this kind of search engine is to provide the required information in a short time with maximum accuracy.

Clustering Methods in Data Mining with its Applications in High Education

Data mining, Web | Desktop Application
Clustering Methods in Data Mining with its Applications in High Education Data mining is a new technology, developing alongside databases and artificial intelligence. It is the process of extracting credible, novel, effective and understandable patterns from databases. Cluster analysis is an important data mining technique used to find data segmentation and pattern information. By clustering the data, people can obtain the data distribution, observe the character of each cluster, and make further study on particular clusters. In addition, cluster analysis usually acts as the preprocessing step for other data mining operations. Therefore, cluster analysis has become a very active research topic in data mining. With the development of data mining, a number of clustering methods have been proposed. The study of clustering technique from the perspective of statistics, based on…
Read More

A Novel Algorithm for Automatic Document Clustering

Data mining, Web | Desktop Application
A Novel Algorithm for Automatic Document Clustering The Internet has become an indispensable part of today’s life. The World Wide Web (WWW) is the largest shared information source. Finding relevant information on the WWW is challenging. To respond to a user query, it is difficult to search through the large number of returned documents even with today’s search engines. There is a need to organize a large set of documents into categories through clustering. The documents can be a user query or simply a collection of documents. Document clustering is the task of combining a set of documents into clusters so that intra-cluster documents are more similar to each other than to inter-cluster documents. Partitioning and Hierarchical algorithms are commonly used for document clustering. Existing partitioning algorithms have the limitation…
Read More

Dynamic Personalized Recommendation on Sparse Data

Data mining, Web | Desktop Application
Dynamic Personalized Recommendation on Sparse Data Recommendation techniques are very important in the fields of E-commerce and other Web-based services. One of the main difficulties is dynamically providing high-quality recommendation on sparse data. In this paper, a novel dynamic personalized recommendation algorithm is proposed, in which information contained in both ratings and profile contents are utilized by exploring latent relations between ratings, a set of dynamic features are designed to describe user preferences in multiple phases, and finally a recommendation is made by adaptively weighting the features. Experimental results on public datasets show that the proposed algorithm has satisfying performance.
Read More

Efficient Algorithms for Mining High Utility Itemsets from Transactional Databases

Data mining, Web | Desktop Application
Efficient Algorithms for Mining High Utility Itemsets from Transactional Databases Mining high utility itemsets from a transactional database refers to the discovery of itemsets with high utility like profits. Although a number of relevant algorithms have been proposed in recent years, they incur the problem of producing a large number of candidate itemsets for high utility itemsets. Such a large number of candidate itemsets degrades the mining performance in terms of execution time and space requirement. The situation may become worse when the database contains lots of long transactions or long high utility itemsets. In this paper, we propose two algorithms, namely utility pattern growth (UP-Growth) and UP-Growth+, for mining high utility itemsets with a set of effective strategies for pruning candidate itemsets. The information of high utility itemsets is…
Read More
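
To make the notion of itemset utility concrete, here is a minimal sketch that enumerates every itemset by brute force. It is not UP-Growth or UP-Growth+; the toy `profit` table, the `transactions` list and the function names are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical data: unit profit per item, and purchased quantity per item in each transaction.
profit = {"a": 5, "b": 2, "c": 1}
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 4, "c": 1},
]


def utility(itemset, db, profit):
    """Sum quantity * profit of the itemset's items over transactions containing the whole itemset."""
    total = 0
    for t in db:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def high_utility_itemsets(db, profit, min_util):
    """Enumerate all itemsets and keep those whose utility reaches min_util."""
    items = sorted(profit)
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            u = utility(combo, db, profit)
            if u >= min_util:
                result[combo] = u
    return result


hui = high_utility_itemsets(transactions, profit, min_util=12)
```

In this toy data the itemset {a, c} has utility 13 although {c} alone has utility 4, so utility is not anti-monotone the way support is. That is one reason candidate pruning needs dedicated strategies such as those the abstract mentions.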

Sensitive Label Privacy Protection on Social Network Data

Data mining, Web | Desktop Application
Sensitive Label Privacy Protection on Social Network Data This paper is motivated by the recognition of the need for a finer grain and more personalized privacy in data publication of social networks. We propose a privacy protection scheme that not only prevents the disclosure of identity of users but also the disclosure of selected features in users' profiles. An individual user can select which features of her profile she wishes to conceal. The social networks are modeled as graphs in which users are nodes and features are labels. Labels are denoted either as sensitive or as non-sensitive. We treat node labels both as background knowledge an adversary may possess, and as sensitive information that has to be protected. We present privacy protection algorithms that allow for graph data to be…
Read More

Privacy against Aggregate Knowledge Attacks

Data mining, Web | Desktop Application
Privacy against Aggregate Knowledge Attacks This paper focuses on protecting the privacy of individuals in publication scenarios where the attacker is expected to have only abstract or aggregate knowledge about each record. Whereas data privacy research usually focuses on defining stricter privacy guarantees that assume increasingly more sophisticated attack scenarios, it is also important to have anonymization methods and guarantees that will address any attack scenario. Enforcing a stricter guarantee than required increases unnecessarily the information loss. Consider for example the publication of tax records, where attackers might only know the total income, and not its constituent parts. Traditional anonymization methods would protect user privacy by creating equivalence classes of identical records. Alternatively, in this work we propose an anonymization technique that generalizes attributes, only as much…
Read More

Adapting a Ranking Model for Domain-Specific Search

Data mining, Web | Desktop Application
Adapting a Ranking Model for Domain-Specific Search An adaptation process is described to adapt a ranking model constructed for a broad-based search engine for use with a domain-specific ranking model. It is difficult to apply the broad-based ranking model directly to different domains because of domain differences, and building a unique ranking model for each domain is time-consuming because each model must be trained. In this paper, we address these difficulties by proposing an algorithm called ranking adaptation SVM (RA-SVM). Our algorithm requires only the predictions from the existing ranking models, rather than their internal representations or the data from auxiliary domains. The ranking model is adapted for use in a search environment focusing on a specific segment of online content, for example, a specific topic, media type, or genre of content. A domain-specific ranking model…
Read More

Efficient Similarity Search over Encrypted Data

Data mining, Web | Desktop Application
Efficient Similarity Search over Encrypted Data Large amounts of data have been stored in the cloud. Although cloud based services offer many advantages, privacy and security of the sensitive data is a big concern. To mitigate the concerns, it is desirable to outsource sensitive data in encrypted form. Encrypted storage protects the data against illegal access, but it complicates some basic, yet important functionality such as the search on the data. To achieve search over encrypted data without compromising the privacy, a considerable number of searchable encryption schemes have been proposed in the literature. However, almost all of them handle exact query matching but not similarity matching, a crucial requirement for real world applications. Although some sophisticated secure multi-party computation based cryptographic techniques are available for similarity tests, they are computationally intensive…
Read More

Opinion Mining for web search

Data mining, Web | Desktop Application
Opinion Mining for web search Generally, a search engine retrieves information using PageRank, distance vector algorithms, crawling, etc. on the basis of the user’s query. However, the links retrieved by the search engine may or may not be exactly related to the user’s query, and the user has to check all the links to know whether the needed information is present in the document, which becomes a tedious and time-consuming job. Our focus is to cluster different documents based on subjective similarities and dissimilarities. Our proposed tool, ‘Web Search Miner’, is based on the concept of user opinion mining and uses the k-means search algorithm with a distance measure based on term frequency and web document frequency for mining the search…
Read More

Distributed Association rule mining : Market basket Analysis

Data mining
Distributed Association rule mining : Market basket Analysis Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by retrospective tools typical of decision support systems. Data mining tools can answer business questions that traditionally were too time consuming to resolve. They scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations.
Read More

web usage mining using apriori

Data mining, Web | Desktop Application
web usage mining using apriori The enormous amount of information on the World Wide Web makes it an obvious candidate for data mining research. The application of data mining techniques to the World Wide Web is referred to as Web mining, a term that has been used in three distinct ways: Web Content Mining, Web Structure Mining and Web Usage Mining. E-Learning is one Web-based application that faces a large amount of data. In order to produce the E-Learning portal’s usage patterns and user behaviors, this paper implements the high-level process of Web Usage Mining using an advanced Association Rules algorithm called the D-Apriori Algorithm. Web Usage Mining consists of three main phases, namely Data Preprocessing, Pattern Discovering and Pattern Analysis. Server log files become a set of raw…
Read More
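
The pattern discovery phase can be sketched with the classic Apriori algorithm. This is the textbook level-wise version, not the D-Apriori variant named in the abstract, and the page names in `sessions` are invented stand-ins for preprocessed server log sessions.

```python
from itertools import combinations


def apriori(sessions, min_support):
    """Return {itemset: count} for every itemset that occurs in at least min_support sessions."""
    sessions = [frozenset(s) for s in sessions]
    candidates = {frozenset([page]) for s in sessions for page in s}
    frequent = {}
    k = 1
    while candidates:
        # Count how many sessions contain each candidate itemset.
        counts = {c: sum(1 for s in sessions if c <= s) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent


# Hypothetical page sets visited in four user sessions.
sessions = [
    {"home", "courses", "quiz"},
    {"home", "courses"},
    {"home", "forum"},
    {"courses", "quiz"},
]
frequent_pages = apriori(sessions, min_support=2)
```

With a minimum support of two sessions, the frequent pairs are {home, courses} and {courses, quiz}. The three-page candidate is pruned without counting because {home, quiz} is not frequent.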

Sales & Inventory Prediction using Data Mining

Data mining, Web | Desktop Application
Sales & Inventory Prediction using Data Mining Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by retrospective tools typical of decision support systems. Data mining tools can answer business questions that traditionally were too time consuming to resolve. They scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations.
Read More

Hiding Sensitive Association Rule for Privacy Preservation

Data mining, Web | Desktop Application
Hiding Sensitive Association Rule for Privacy Preservation Data mining techniques have been widely used in various applications. However, the misuse of these techniques may lead to the disclosure of sensitive information. Researchers have recently made efforts at hiding sensitive association rules. Nevertheless, undesired side effects, e.g., non sensitive rules falsely hidden and spurious rules falsely generated, may be produced in the rule hiding process. In this paper, we present a novel approach that strategically modifies a few transactions in the transaction database to decrease the supports or confidences of sensitive rules without producing the side effects. Since the correlation among rules can make it impossible to achieve this goal, in this paper, we propose heuristic methods for increasing the number of hidden sensitive rules and reducing the number of modified…
Read More
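
The idea of lowering a sensitive rule's confidence by modifying a few transactions can be shown with a naive greedy sketch. It is not the heuristic proposed in the abstract: it removes one consequent item from supporting transactions until the confidence drops below a threshold and makes no attempt to avoid side effects on other rules. The function names and the toy database are assumptions.

```python
def support_count(db, itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in db if itemset <= t)


def confidence(db, lhs, rhs):
    """Confidence of the rule lhs -> rhs, or 0.0 if lhs never occurs."""
    lhs_count = support_count(db, lhs)
    return support_count(db, lhs | rhs) / lhs_count if lhs_count else 0.0


def hide_rule(db, lhs, rhs, min_conf):
    """Greedily delete one rhs item from supporting transactions until confidence < min_conf."""
    sanitized = [set(t) for t in db]   # work on a copy of the database
    victim = sorted(rhs)[0]            # the consequent item chosen for removal
    modified = 0
    for t in sanitized:
        if confidence(sanitized, lhs, rhs) < min_conf:
            break
        if (lhs | rhs) <= t:
            t.discard(victim)
            modified += 1
    return sanitized, modified


# Hypothetical transaction database; the sensitive rule is bread -> milk.
db = [{"bread", "milk"}, {"bread", "milk"}, {"bread", "milk"}, {"bread"}, {"milk"}]
lhs, rhs = frozenset({"bread"}), frozenset({"milk"})

sanitized, modified = hide_rule(db, lhs, rhs, min_conf=0.6)
```

In this example the rule starts at a confidence of 0.75, and editing a single transaction brings it to 0.5, below the 0.6 threshold.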

Efficiency of content distribution via network coding

Networking
Efficiency of content distribution via network coding Content distribution via network coding has received a lot of attention lately. However, direct application of network coding may be insecure. In particular, attackers can inject “bogus” data to corrupt the content distribution process so as to hinder the information dispersal or even deplete the network resource. Therefore, content verification is an important and practical issue when network coding is employed. When random linear network coding is used, it is infeasible for the source of the content to sign all the data, and hence, the traditional “hash-and-sign” methods are no longer applicable. Recently, a new on-the-fly verification technique has been proposed by Krohn et al. (IEEE S&P ’04), which employs a classical homomorphic hash function. However, this technique is difficult to be applied…
Read More

Effective Pattern Discovery for Text Mining

Data mining, Web | Desktop Application
Effective Pattern Discovery for Text Mining Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the domain of text mining. Since most existing text mining methods adopted term-based approaches, they all suffer from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern (or phrase)-based approaches should perform better than the term-based ones, but many experiments do not support this hypothesis. This paper presents an innovative and effective pattern discovery technique which includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information.
Read More

Data leakage Detection

Security and Encryption, Web | Desktop Application
Data leakage Detection While doing business, sometimes sensitive data must be handed over to supposedly trusted third parties. For example, a hospital may give patient records to researchers who will devise new treatments. Similarly, a company may have partnerships with other companies that require sharing customer data. Another enterprise may outsource its data processing, so data must be given to various other companies. We call the owner of the data the distributor and the supposedly trusted third parties the agents. Our goal is to detect when the distributor’s sensitive data has been leaked by agents, and if possible to identify the agent that leaked the data. We consider applications where the original sensitive data cannot be perturbed. Perturbation is a very useful technique where the data is modified and made…
Read More

Medical Disease diagnosis using Data Mining

Data mining, Web | Desktop Application
Medical Disease diagnosis using Data Mining The healthcare industry collects a huge amount of data which is not properly mined and not put to the optimum use. Discovery of these hidden patterns and relationships often goes unexploited. Our research focuses on this aspect of Medical diagnosis by learning pattern through the collected data of diabetes, hepatitis and heart diseases and to develop intelligent medical decision support systems to help the physicians. In this paper, we propose the use of decision trees C4.5 algorithm, ID3 algorithm and CART algorithm to classify these diseases and compare the effectiveness, correction rate among them.
Read More
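
As a rough illustration of how a decision tree picks a split, the sketch below computes information gain, the criterion used by ID3 (C4.5 uses gain ratio and CART typically uses Gini impurity). The four records and the attribute names are invented toy values, not patient data from the project.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy of the target minus the weighted entropy after splitting on attr."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical toy records for illustration only.
records = [
    {"high_glucose": "yes", "smoker": "no", "diabetic": "yes"},
    {"high_glucose": "yes", "smoker": "yes", "diabetic": "yes"},
    {"high_glucose": "no", "smoker": "yes", "diabetic": "no"},
    {"high_glucose": "no", "smoker": "no", "diabetic": "no"},
]

best = max(
    ["high_glucose", "smoker"],
    key=lambda a: information_gain(records, a, "diabetic"),
)
```

On these records `high_glucose` separates the classes perfectly (gain of 1 bit) while `smoker` carries no information, so the root split would be on `high_glucose`.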