Vector-based Sentiment Analysis of Movie Reviews

Artificial Intelligence & ML, Data mining, Machine Learning
We investigate sentence sentiment using the Pang and Lee dataset as annotated by Socher et al. [1]. Sentiment analysis research focuses on understanding the positive or negative tone of a sentence based on sentence syntax, structure, and content. Previous research used a tree-based model to label sentence sentiment on a 5-point scale. Our project takes a different approach: abstracting the sentence as a vector and applying vector classification schemes. We explore two components. First, we analyze the use of different sentence representations, such as bag of words, word sentiment location, negation, etc., and abstract them into a set of features. Second, we classify sentence sentiment using this set of features and compare the effectiveness of…
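As a rough illustration of turning a sentence into a feature vector, the sketch below builds bag-of-words counts and marks words that follow a negation cue. The cue list, the `NOT_` prefix and the function name `featurize` are assumptions made for this example; the excerpt does not describe the project's actual feature extraction.

```python
import re
from collections import Counter

# Hypothetical negation cues; the project's actual cue list is not given.
NEGATORS = {"not", "no", "never"}


def featurize(sentence):
    """Return bag-of-words counts, prefixing words after a negation cue with NOT_."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    features = Counter()
    negated = False
    for token in tokens:
        if token in NEGATORS or token.endswith("n't"):
            # Simplifying assumption: negation scope runs to the end of the sentence.
            negated = True
            features["has_negation"] = 1
            continue
        features["NOT_" + token if negated else token] += 1
    return features


example = featurize("The plot was not good")
```

The resulting counts can be fed to any vector classifier; treating "good" and "NOT_good" as separate features is one simple way to let negation change the predicted label.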

Using Tweets for single stock price prediction

Artificial Intelligence & ML, Data mining, Machine Learning
Social media, as the collective form of individual opinions and emotions, has a profound though perhaps subtle relationship with social events. This is particularly true of public tweets and stock trading. In fact, research has shown that when it comes to financial decisions, people are significantly driven by emotions [1]. These emotions, together with people’s opinions, are reflected in real time by tweets. As a result, by analyzing relevant tweets with suitable machine learning algorithms, one could gauge the public’s sentiment and attitude towards the price of a stock of interest, which could plausibly predict its next move. Previous work has shown that tweets can indeed reflect stock price changes. Bollen et al. (2010) randomly selected…

Learning To Predict Dental Caries For Preschool Children

Artificial Intelligence & ML, Data mining, Machine Learning
Dental caries, or tooth decay/cavity, is a dental disease caused by bacterial infection. Among age groups, preschool children require more attention, since caries has become the most common chronic childhood disease. More importantly, a skewed distribution of the disease has been observed in Europe, the US and Singapore among children and preschoolers, which indicates that a small portion of the population endures a large portion of caries incidence. Therefore, there is still a need to improve current caries control to identify high-risk individuals and prevent resurgence in children in developed countries like Singapore. Our project will study data such as questionnaire responses, oral examinations and biological tests of certain preschoolers from Singapore and use suitable…

Predicting air pollution level in a specific city

Artificial Intelligence & ML, Data mining, Machine Learning
The regulation of air pollutant levels is rapidly becoming one of the most important tasks for the governments of developing countries, especially China. Among pollutant indices, fine particulate matter (PM2.5) is a significant one because it is a major concern for people's health when its level in the air is relatively high. PM2.5 refers to tiny particles in the air that reduce visibility and cause the air to appear hazy when levels are elevated. However, the relationships between the concentration of these particles and meteorological and traffic factors are poorly understood. To shed some light on these connections, some advanced techniques have been introduced into air quality research. These studies utilized selected techniques, such as Support Vector Machine (SVM)…

Sentiment Analysis on Movie Reviews

Artificial Intelligence & ML, Data mining, Machine Learning
Sentiment analysis is a well-known task in the realm of natural language processing. Given a set of texts, the objective is to determine the polarity of each text. [9] provides a comprehensive survey of various methods, benchmarks, and resources of sentiment analysis and opinion mining. The sentiments can consist of different classes. In this study, we consider two cases: 1) A movie review is positive (+) or negative (-). This is similar to [2], where the authors also employ a novel similarity measure. In [10], the authors perform sentiment analysis after summarizing the text. 2) A movie review is very negative (- -), somewhat negative (-), neutral (o), somewhat positive (+), or very positive (+ +). For the first case, we picked a Kaggle competition called “Bag…

Predicting Soccer Results in the English Premier League

Artificial Intelligence & ML, Data mining, Machine Learning
There were many displays of genius during the 2010 World Cup, from Andrés Iniesta to Thomas Müller, but none was as unusual as that of Paul the Octopus. This sea dweller correctly chose the winner of a match all eight times that he was tested. This accuracy contrasts sharply with the World Cup predictions of one of our team members, who was correct only about half the time. Partly out of love for the game, and partly from the shame of being outdone by an octopus, we decided to attempt to predict the outcomes of soccer matches. This has real-world applications for gambling, coaching improvements, and journalism. Out of the many leagues we could have chosen, we decided upon the…

Classifying Online User Behavior Using Contextual Data

Artificial Intelligence & ML, Data mining, Machine Learning
Despite the great computational power of machines, there are some things, like interest-based segregation, that only humans can instinctively distinguish. For example, a human can easily tell whether a tweet is about a book or about a kitchen utensil. However, to write a rule-based computer program to solve this task, a programmer must lay down very precise criteria for these classifications. There has been a massive increase in the amount of structured user-generated content on the Internet in the form of tweets, reviews on Amazon and eBay, etc. As opposed to stand-alone companies, which leverage their own hubs of data to run behavioral analytics, we strive to gain insights into online user behavior and interests based on free and public data. By…

Extracting Word Relationships from Unstructured Data

Artificial Intelligence & ML, Machine Learning
Robots are advancing rapidly in their behavioural functionality, allowing them to perform sophisticated tasks. However, their ability to take natural language instructions is still in its infancy. Parsing, semantic interpretation and dialogue management are typically performed only on a limited set of primitives, thus limiting the set of instructions that can be given to a robot. This limits a robot’s applicability in unconstrained natural environments (like households and offices) [8]. In this project, we address only the problem of semantic interpretation of human instructions. Specifically, our Extracto algorithm provides a method to extract potential actions (verbs) that could be performed given two household objects (nouns). For example, given the nouns “Coffee” and “Cup”, Extracto identifies the action (verb) “pour” indicating that ‘coffee should…

PREDICTING HOSPITAL READMISSIONS IN THE MEDICARE POPULATION

Artificial Intelligence & ML, Data mining, Machine Learning, MSC IT
Avoidable hospital readmissions cost taxpayers billions of dollars each year. The Medicare Payment Advisory Commission has estimated that almost $12 billion is spent annually by Medicare on potentially preventable readmissions within 30 days of a patient’s discharge from a hospital [1]. The Medicare program has begun to apply financial penalties to hospitals that have excessive risk-adjusted readmission rates. There is much interest in the health policy and medical communities in the ability to accurately predict which patients are at high risk of being readmitted. Not only are there strong financial reasons to avoid readmissions, but readmission to the hospital can also be a sign of poor clinical care and can indicate a worsening of a patient’s condition [2]. If doctors and nurses were aware of…

Collaborative Filtering Recommender Systems

Artificial Intelligence & ML, Data mining, Machine Learning
Collaborative filtering (CF) predicts user preferences in item selection based on the known user ratings of items. As one of the most common approaches to recommender systems, CF has proved effective for solving the information overload problem. CF can be divided into two main branches: memory-based and model-based. Most present research improves the accuracy of memory-based algorithms only by improving the similarity measures, and little research has focused on the prediction score models, which we believe are more important than the similarity measures. The best-known model-based algorithm is matrix factorization. Compared to memory-based algorithms, matrix factorization generally has higher accuracy. However, matrix factorization may fall into a local optimum in the learning process, which leads to inadequate…
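To make the model-based branch concrete, here is a minimal sketch of matrix factorization trained by stochastic gradient descent on known ratings. The toy ratings, the rank, the learning rate and the regularization weight are all illustrative assumptions, and this is a generic textbook formulation rather than the method proposed in the paper.

```python
import random

# Toy (user, item, rating) triples; purely illustrative data.
RATINGS = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]


def predict(P, Q, user, item):
    """Predicted rating is the dot product of the user and item factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[user], Q[item]))


def rmse(ratings, P, Q):
    """Root mean squared error over the known ratings."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5


def factorize(ratings, n_users, n_items, rank=2, lr=0.01, reg=0.02,
              epochs=1000, seed=0):
    """Learn user factors P and item factors Q by SGD on squared error."""
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(rank):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularization on both factors.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


P, Q = factorize(RATINGS, n_users=3, n_items=3)
```

Because the objective is non-convex, different seeds can end in different solutions, which is the local-optimum issue the abstract mentions.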

Blowing up the Twittersphere: Predicting the Optimal Time to Tweet

Artificial Intelligence & ML, Data mining, Machine Learning
We can separate our problem into a few different steps. First, we need to model information about a tweet and how successful a given tweet is. Second, given a tweet, user, and post time, we must predict how successful that tweet will be. Finally, we need to use our predictor to determine the optimal time for a given user to post a specific tweet, i.e. the time that maximizes our success prediction for a specific user and tweet. We considered two papers that address similar problems of using machine learning to understand interactions in social media and predict the success of online content. Lakkaraju, McAuley, and Leskovec consider the connections between title, content and community in social media. From their work,…

Recognition and Classification of Fast Food Images

Artificial Intelligence & ML, Data mining, Machine Learning
Food recognition is of great importance nowadays for multiple purposes. On one hand, people who want a better understanding of food they are not familiar with, or have never seen before, can simply take a picture and learn more details about it. On the other hand, the increasing demand for dietary assessment tools that record calories and nutrition has also been a driving force in the development of food recognition techniques. Therefore, automatic food recognition is very important and has great application potential. However, food varies greatly in appearance (e.g., shape, colors), with many different ingredients and assembling methods. This makes food recognition a difficult task for current state-of-the-art classification methods, and…

Predicting Heart Attacks

Artificial Intelligence & ML, Data mining, Machine Learning
In the field of medical science, there is a huge amount of data. Data mining techniques are being used to discover hidden patterns from these data, and advanced data mining techniques have been developed in recent years. The efficiency of these techniques is compared using sensitivity, specificity, accuracy and error rate. Some well-known data mining classification techniques are decision trees, artificial neural networks, support vector machines and the naïve Bayes classifier. In this paper, we introduce a new method based on the fitness value of the attribute to predict the heart disease problem. We use 10 attributes for our proposed method and use simple calculation. In our everyday life, several examples exist where we have to analyze historical data, for example, a bank loans officer needs analysis…

E-Commerce Sales Prediction Using Listing Keywords

Artificial Intelligence & ML, Data mining, Machine Learning
Small online retailers usually set themselves apart from brick and mortar stores, traditional brand names, and giant online retailers by offering goods at an exceptional value. In addition to price, they compete for shoppers’ attention via descriptive listing titles, whose effectiveness as search keywords can help drive sales. In this study, machine learning techniques will be applied to online retail data to measure the link between keywords and sales volumes.

Prediction and Classification of Cardiac Arrhythmia

Artificial Intelligence & ML, Data mining, Machine Learning
Irregularity in heartbeat may be harmless or life-threatening. Hence both accurate detection of the presence of arrhythmia and its classification are important. Arrhythmia can be diagnosed by measuring heart activity using an instrument called an ECG or electrocardiograph and then analyzing the recorded data. Different parameter values can be extracted from the ECG waveforms and used along with other information about the patient, like age, medical history, etc., to detect arrhythmia. However, it may sometimes be difficult for a doctor to look at these long-duration ECG recordings and find minute irregularities. Therefore, using machine learning to automate arrhythmia diagnosis can be very helpful. The project aims at using different machine learning algorithms like Naive Bayes, SVM, Random Forests and Neural Networks…

Sentiment Analysis for Hotel Reviews

Artificial Intelligence & ML, Data mining, Machine Learning
Travel planning and hotel booking have become an important commercial use of the web. Sharing on the web has become a major tool for expressing customer thoughts about a particular product or service. Recent years have seen rapid growth in online discussion groups and review sites (e.g. www.tripadvisor.com), where a crucial characteristic of a customer’s review is its sentiment or overall opinion. For example, a review containing words like ‘great’, ‘best’, ‘nice’, ‘good’, ‘awesome’ is probably positive, whereas a review containing words like ‘bad’, ‘poor’, ‘awful’, ‘worse’ is probably negative. However, Trip Advisor’s star rating does not express the exact experience of the customer. Most of the ratings are meaningless; a large chunk of reviews fall in the…

Mood Detection with Tweets

Artificial Intelligence & ML, Data mining, Machine Learning
Emotional states of individuals, also known as moods, are central to the expression of thoughts, ideas, and opinions, and in turn impact attitudes and behavior. Social media tools like Twitter are increasingly used by individuals to broadcast their day-to-day happenings or to report on external events of interest, so understanding the rich “landscape” of moods will help us better interpret millions of individuals. This paper describes a rule-based approach that detects the emotion or mood of a tweet and classifies the Twitter message under the appropriate emotional category. The accuracy of the system is 85%. With the proposed system it is possible to understand deeper, finer-grained levels of emotion rather than coarse-grained sentiment. The sentiment says whether the tweet is positive…

Parking Occupancy Prediction and Pattern Analysis

Artificial Intelligence & ML, Data mining, Machine Learning
According to the Department of Parking and Traffic, San Francisco has more cars per square mile than any other city in the US [1]. The search for an empty parking spot can become an agonizing experience for the city’s urban drivers. A recent article claims that drivers cruising for a parking spot in SF generate 30% of all downtown congestion [2]. These wasted miles not only increase traffic congestion, but also lead to more pollution and driver anxiety. In order to alleviate this problem, the city armed 7000 metered parking spaces and 12,250 garage spots (a total of 593 parking lots) with sensors and introduced a mobile application called SFpark [3], which provides drivers with real-time information about the availability of a parking lot. However,…

Predicting Usefulness of Yelp Reviews

Artificial Intelligence & ML, Data mining, Machine Learning
The Yelp Dataset Challenge makes a huge set of user, business, and review data publicly available for machine learning projects. Yelp wishes to find interesting trends and patterns in all of the data it has accumulated. Our goal is to predict how useful a review will prove to be to users. We can use review upvotes as a metric. This could have immediate applications: many people rely on Yelp to make consumer choices, so predicting the most helpful reviews to display on a page before they have actually been rated would have a serious impact on user experience.

Multiclass Classifier Building with Amazon Data to Classify Customer Reviews into Product Categories

Artificial Intelligence & ML, Data mining, Machine Learning
E-commerce (electronic commerce) is defined as the buying and selling of products over electronic systems such as the Internet. With the widespread use of the Internet, trade conducted electronically (online) has grown extraordinarily. E-commerce companies have large databases of products and many consumers who use these data. To address this data and information explosion, e-commerce stores are applying machine learning to identify and customize product category information. Data scientists in this field are using machine learning to build competitiveness in the market by finding purchase preferences, customer churn, product suggestions, etc. Applying popular Machine Learning algorithms to huge datasets brought new challenges for the ML…

Practical Approximate k Nearest Neighbor Queries with Location and Query Privacy

Artificial Intelligence & ML, Data mining, Machine Learning
In mobile communication, spatial queries pose a serious threat to user location privacy because the location of a query may reveal sensitive information about the mobile user. In this paper, we study approximate k nearest neighbor (kNN) queries where the mobile user queries the location-based service (LBS) provider about the approximate k nearest points of interest (POIs) on the basis of his current location. We propose a basic solution and a generic solution for the mobile user to preserve his location and query privacy in approximate kNN queries. The proposed solutions are mainly built on the Paillier public-key cryptosystem and can provide both location and query privacy. To preserve query privacy, our basic solution allows the mobile user to retrieve…

QUANTIFYING POLITICAL LEANING FROM TWEETS, RETWEETS, AND RETWEETERS

Artificial Intelligence & ML, Data mining, Machine Learning, MSC IT
In recent years, big online social media data have found many applications in the intersection of political and computer science. Examples include answering questions in political and social science (e.g., proving/disproving the existence of media bias [3, 30] and the “echo chamber” effect [1, 5]), using online social media to predict election outcomes [46, 31], and personalizing social media feeds so as to provide a fair and balanced view of people’s opinions on controversial issues [36]. A prerequisite for answering the above research questions is the ability to accurately estimate the political leaning of the population involved. If it is not met, either the conclusion will be invalid, the prediction will perform poorly [35, 37] due to a skew towards highly vocal…

Efficient Algorithms for Mining Top-K High Utility Itemsets

Artificial Intelligence & ML, Data mining, Machine Learning
In recent years, online shopping has become more and more popular. When deciding whether to purchase a product online, the opinions of others become important. This presents a great opportunity to share our viewpoints on purchased products. However, people face an information overload problem. How to mine valuable information from reviews to understand a user’s preferences and make accurate recommendations is crucial. Traditional recommender systems consider factors such as a user’s purchase records, product category, and geographic location. This work proposes a sentiment-based rating prediction method to improve prediction accuracy in recommender systems. First, it proposes a social user sentiment measurement approach and calculates each user’s sentiment on items. Second, it not only…

Efficient Algorithms for Mining Top-K High Utility Itemsets

Artificial Intelligence & ML, Data mining, Machine Learning
Mining high utility itemsets from databases is an emerging topic in data mining, which refers to the discovery of itemsets with utilities higher than a user-specified minimum utility threshold min_util. Although several studies have been carried out on this topic, setting an appropriate minimum utility threshold is a difficult problem for users. If min_util is set too low, too many high utility itemsets will be generated, which may cause the mining algorithms to become inefficient or even run out of memory. On the other hand, if min_util is set too high, no high utility itemset will be found. Setting appropriate minimum utility thresholds by trial and error is a tedious process for users. In this paper, we address this problem by proposing…
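The difference between thresholding on min_util and asking for the top k itemsets can be sketched with a brute-force enumeration. The transactions, unit profits and function names below are invented for illustration, and exhaustive enumeration is only feasible on toy data; it is not the efficient algorithm the paper proposes.

```python
from heapq import nlargest
from itertools import combinations

# Hypothetical unit profits and transactions (item -> purchased quantity).
PROFIT = {"a": 5, "b": 2, "c": 1}
TRANSACTIONS = [{"a": 1, "b": 2}, {"b": 4, "c": 3}, {"a": 2, "c": 1}]


def itemset_utilities(transactions, profit):
    """Map every itemset that occurs in some transaction to its total utility."""
    utilities = {}
    for quantities in transactions:
        items = sorted(quantities)
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                # Utility in one transaction: sum of quantity * unit profit.
                utility = sum(quantities[i] * profit[i] for i in itemset)
                utilities[itemset] = utilities.get(itemset, 0) + utility
    return utilities


def high_utility(transactions, profit, min_util):
    """Threshold-based mining: keep itemsets with utility >= min_util."""
    utilities = itemset_utilities(transactions, profit)
    return {s: u for s, u in utilities.items() if u >= min_util}


def top_k_high_utility(transactions, profit, k):
    """Top-k mining: the user picks k instead of guessing min_util."""
    utilities = itemset_utilities(transactions, profit)
    return nlargest(k, utilities.items(), key=lambda pair: pair[1])


top_two = top_k_high_utility(TRANSACTIONS, PROFIT, 2)
```

On this toy data, raising min_util from 11 to 16 takes the result from four itemsets to none, which is the tuning problem that asking for a fixed k avoids.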

Crowdsourcing for Top-K Query Processing over Uncertain Data

Artificial Intelligence & ML, Data mining
Querying uncertain data has become a prominent application due to the proliferation of user-generated content from social media and of data streams from sensors. When data ambiguity cannot be reduced algorithmically, crowdsourcing proves a viable approach, which consists in posting tasks to humans and harnessing their judgment to improve the confidence about data values or relationships. This paper tackles the problem of processing top-K queries over uncertain data with the help of crowdsourcing, in order to quickly converge to the real ordering of relevant results. Several offline and online approaches for addressing questions to a crowd are defined and contrasted on both synthetic and real datasets, with the aim of minimizing the crowd interactions necessary to find the real ordering of the result set.…

Cyberbullying Detection based on Semantic-Enhanced Marginalized Denoising Auto-Encoder

Artificial Intelligence & ML, Data mining
As a side effect of increasingly popular social media, cyberbullying has emerged as a serious problem afflicting children, adolescents, and young adults. Machine learning techniques make automatic detection of bullying messages in social media possible, and this could help to construct a healthy and safe social media environment. In this meaningful research area, one critical issue is robust and discriminative numerical representation learning of text messages. In this paper, we propose a new representation learning method to tackle this problem. Our method, named semantic-enhanced marginalized denoising auto-encoder (smSDA), is developed via a semantic extension of the popular deep learning model stacked denoising autoencoder (SDA). The semantic extension consists of semantic dropout noise and sparsity constraints, where the semantic dropout noise is designed…

Mining Facets For Queries From Their Search Results

Artificial Intelligence & ML, Data mining, Machine Learning
A query facet is a set of items which describe and summarize one important aspect of a query. Here a facet item is typically a word or a phrase. A query may have multiple facets that summarize the information about the query from different perspectives. For the query “watches”, its query facets cover the knowledge about watches in five unique aspects, including brands, gender categories, supporting features, styles, and colors. The query “visit Beijing” has a facet about popular resorts in Beijing (Tiananmen Square, Forbidden City, Summer Palace, ...) and a facet on several travel-related topics (attractions, shopping, dining, ...). Query facets provide interesting and useful knowledge about a query and thus can be used to improve search experiences in many ways.…

Sentiment Analysis of Top Colleges in India Using Twitter Data

Artificial Intelligence & ML, Data mining, Machine Learning
Social media has captured the attention of the entire world: it sends thoughts across the globe at lightning speed, is user-friendly, and is free of cost, requiring only a working internet connection. People use this platform extensively to share their thoughts loud and clear. Twitter is one such well-known micro-blogging site, receiving around 500 million tweets per day. Each user has a daily limit of 2,400 tweets and 140 characters per tweet. Twitter users post (or ‘tweet’) every day about various subjects like products, services, day-to-day activities, places, personalities etc. Hence, Twitter data is highly germane, as it can be used in various scenarios where companies or brands can utilize a direct connection to almost each…

FiDoop: Parallel Mining of Frequent Itemsets Using MapReduce

Artificial Intelligence & ML, Data mining, Hadoop
Data mining is the process of discovering patterns in huge amounts of data. There are many data mining techniques, like clustering, classification and association rules. The most popular one is association rule mining, which is divided into two parts: i) generating the frequent itemsets and ii) generating association rules from the frequent itemsets. Frequent itemset mining (FIM) is the core problem in association rule mining. Sequential FIM algorithms suffer from performance deterioration when operating on a huge amount of data on a single machine. To address this problem, parallel FIM algorithms were proposed. Two types of algorithms can be used for mining frequent itemsets: the candidate-itemset generation approach and the approach without candidate itemset generation.…
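The candidate-itemset generation approach can be illustrated with a small sequential Apriori-style sketch. The transactions and the support threshold are invented for this example, and the code runs on a single machine; it does not show FiDoop's MapReduce parallelization.

```python
from itertools import combinations

# Hypothetical market-basket transactions.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Candidate generation: join frequent (k-1)-itemsets, then prune any
        # candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Support counting: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


frequent_itemsets = apriori(TRANSACTIONS, min_support=2)
```

The support-counting pass is the step that becomes expensive on a single machine as the data grows, which is what motivates splitting the work across machines.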

SEO optimizer and suggester

Data mining, Web | Desktop Application
The SEO optimizer and suggester consists of an entire search engine for analyzing and ranking websites and for suggesting SEO tips. The search engine analyzes websites and ranks them accordingly. A website grading algorithm allows the search engine to appropriately read and assess the website content. It analyzes and stores analytic data for various websites. This data is used to rank the website accordingly. The SEO suggestion page is used to provide SEO tips for a website. The suggester has an input box for entering the website URL. Once the URL is entered, the system crawls the website, analyzes its data and provides appropriate solutions for optimizing it for better SEO performance. The software project points out various drawbacks in the website and provides tips and solutions for them.

Image Mining Project

Data mining, Image Processing, Web | Desktop Application
This software project concentrates on improved search for images. Usually we find systems that efficiently provide data mining functionality by comparing text data. Text data is easy to mine since we just compare the words (alphabet combinations) to the words in our database. When it comes to images, most systems use data mining to search images based on the image alt attribute or title, that is, the text associated with the image. This system instead searches images based on image patterns and graphical methods, comparing images graphically to find a match between image color values. This efficient image mining system utilizes graphical pattern matching techniques and an algorithm for fast and approximate image retrieval using C#.

Smart Health consulting system

Data mining, Web | Desktop Application
This system aims at maintaining patient health records and booking appointments with various doctors for related treatments. The system user must register as a member of this system and keep updating his medical history. Patients can then select from a list of specialized doctors for their respective treatments (such as a skin specialist, ENT specialist, cardiologist, etc.) at particular locations. Patients may also select suitable appointment timings for their meeting.

Car Sales And Inventory Store Project

Data mining, Web | Desktop Application
This is an online car and car parts store that has listings of various cars along with their features. It also includes car parts and accessories. The project allows users to buy cars and car parts online. It allows users to check various car stats, including engine, mileage, tank capacity and other factors. A credit card payment facility is available for car parts. Car booking has other methods for booking and registration, and there is also a test drive registration. The project features: Visitor Registration/Login module. User may check various car listings with features. User may check the car features and inventory parts. User may select and add products to shopping cart. Credit card payment option for car parts shopping. Test drive booking registration…

Smart Health Consulting Project

Data mining, Web | Desktop Application
This system aims at maintaining patient health records and booking appointments with various doctors for related treatments. The system user must register as a member of this system and keep updating his medical history. Patients can then select from a list of specialized doctors for their respective treatments (such as a skin specialist, ENT specialist, cardiologist, etc.) at particular locations. Patients may also select suitable appointment timings for their meeting. This project contains 7 useful areas: i. General User area, ii. Doctor's area, iii. Patient's area, iv. Transaction/Billing area, v. Administrator area, vi. Pharmacy area, vii. Insurance area.

Farming Assistance Web Service

Data mining, Web | Desktop Application
Farming Assistance Web Service A Web project to help farmers ensure greater profitability through direct farmer to supplier and farmer to farmer communication. This service boosts business communication and brings transparency in the system. This innovative site allows for good farmer, retailer and supplier communication. It allows farmers to login and communicate to respective dealers. When dealers publish an advertisement or offer, the respective farmers get notified via Sms message. The farmers may also submit their grievance and complaints to respective dealers or authorities using their farmer login on a separate complaints page and authorities will get access to that page regularly using their login id and passwords.

Banking Bot Project

Data mining, Web | Desktop Application
Banking Bot Project The banking bot is built using artificial intelligence algorithms that analyze a user's queries and understand the user's message. The system is designed for use by banks, where users can ask any bank-related question about loans, accounts, policies, etc. The application is developed for Android devices. The system recognizes the user's query, understands what the user wants to convey, and answers appropriately. Questions can be asked in any format; there is no specific format that users must follow. The built-in artificial intelligence system works out the user's requirements and provides suitable answers. It also uses a graphical representation of a person speaking while giving answers, as a real person would.

The Cibil System Project

Data mining, Web | Desktop Application
The Cibil System Project A CIBIL system to keep track of people's credit scores and dues. The system is similar to the real CIBIL system with one enhancement: a defaulter can view their status and apply for improvement through good behavior. The software consists of an admin login, a CIBIL associates login and an individual login. CIBIL associates are banks or companies who want to report defaulting members. They send defaulting member data, which is passed on to the CIBIL admin. The admin can view the data and, after inspection, approve it to be added. The system also provides a member login, for which a member first needs to register. They can then check whether they are on the blacklist and for what reason. The member…

Medical Search Engine Project

Data mining, Web | Desktop Application
Medical Search Engine Project Version 1: J2ME mobile app. Version 2: HTML5 and jQuery Mobile based mobile web app (works on Android phones, iOS devices, Windows and BlackBerry phones). Description: This is a simple mobile application that lets the user look up the symptoms of a disease, the medicine recommended for treating it, and a list of medical shops where the medicine is available. A simple UI helps the user find the required information. More information about diseases, symptoms and medical shops can be added simply by adding entries to the database. It is a client-server application, so the same servlet can serve requests from multiple clients. The functionality is available in two versions, developed using two different technologies, to suit your needs.

Artificial Intelligence Dietician

Data mining, Web | Desktop Application
Artificial Intelligence Dietician The online artificial dietician is a bot with artificial intelligence about human diets. It acts as a diet consultant similar to a real dietician. Dieticians are trained in the nutrient value of foods, and advise a person based on their schedule, body type, height and weight. The system likewise asks the user for this data: how many hours the user works, their height, weight, age, etc. It stores and processes this data and then calculates the nutrient intake needed to meet the user's needs. The system then shows an appropriate diet and asks whether the user is happy with it; if not, it shows alternative diets that meet the user's needs.

Web Mining For Suspicious Keyword Prominence

Data mining, Web | Desktop Application
Web Mining For Suspicious Keyword Prominence Web mining can be described as an information mining method that automatically searches, collects and organizes data from indexed online records, which may be structured, unstructured or semi-structured. Web mining techniques are usually used to assess the viability of a specific web page or entity and to determine various factors related to it. This project consolidates the best researched mechanisms from the semantic web and synaptic web at low entropy in order to build an architecture for Semantic-Synaptic web mining. Our proposed project mines web pages for the density of selected keywords in order to check keyword prominence on those pages. This is an important factor in various fields in order to check the prominence…
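The keyword density check described above can be illustrated with a minimal Python sketch. It assumes density means a keyword's share of all words on a page; the function name, the tokenization rule and the sample page text are hypothetical and not taken from the project.

```python
import re
from collections import Counter


def keyword_density(text, keywords):
    """Return each keyword's share of all words in `text` (0.0 to 1.0)."""
    # Lowercase and split into simple word tokens (an assumed tokenization).
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return {kw: 0.0 for kw in keywords}
    counts = Counter(words)
    return {kw: counts[kw.lower()] / len(words) for kw in keywords}


# Hypothetical page text and watch list.
page = "Cheap loans now. Loans approved fast, apply for loans today."
density = keyword_density(page, ["loans", "apply"])
```

A page could then be flagged when the density of a watched keyword exceeds a threshold chosen by the operator.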

Customer Behaviour Prediction Using Web Usage Mining

Data mining, Web | Desktop Application
Customer Behaviour Prediction Using Web Usage Mining Web usage mining involves first recording the behaviour and flow of customers on a website and then mining this data for behavioural patterns. It is an important part of the ecommerce world, allowing websites to analyse previously recorded web traffic data. Ecommerce sites analyse this data to improve performance and to suggest better products and services to customers by identifying them on their next visit. The system is tuned to record web shopping and buying patterns and to track analytics data that supports future predictions. It tracks user budgets, comparison with previous years, user bounce rates (the number of users who turn back from the payment page) and other site usage factors. Factors like returning users allow site owners to make…

Web Server Log Analysis System

Data mining, Web | Desktop Application
Web Server Log Analysis System Web usage data is of prime importance these days. The web pages used on a day-to-day basis and the users logging on to a website are two major categories of such data. Here we propose a web mining algorithm that performs better than most traditional web mining algorithms. We track web data and use the E web miner algorithm for web log analysis and reporting. The algorithm works on ecommerce data, continuously scanning the web log for patterns specified by user conditions. The algorithm is designed to look for various patterns that appear to be in any logical order. It is built to provide analytics data according to predefined algorithms built in for maximum performance and minimum…

User Web Access Records Mining For Business Intelligence

Data mining, Web | Desktop Application
User Web Access Records Mining For Business Intelligence In this project we analyse how business intelligence on a website can be obtained from users’ access records instead of web logs of “hits”. Access records are captured by implementing a data mining algorithm on the website. A user mostly browses the products they are interested in, and the system captures this browsing pattern using the data mining algorithm. The system is a web application where users can view various resources on the website. Users register their profile and receive a user ID and password in order to access the system. Once a user logs in, they gain access to certain resources on the website. The links to the resources on the…

Biomedical Data Mining For Web Page Relevance Checking

Data mining, Web | Desktop Application
Biomedical Data Mining For Web Page Relevance Checking Data mining is a technique used to mine useful data and patterns from large data sets and make the most of the results obtained. Web mining and data mining go hand in hand when creating web mining systems. Web mining includes text mining methodologies that allow reading from and classifying unstructured data. Text mining allows us to detect patterns, keywords and relevant information in unstructured texts. Web mining and data mining systems each have their own uses. Data mining algorithms are efficient at manipulating organized data sets, while web mining algorithms are widely used to scan and mine unorganized and unstructured web pages and text data available on the internet. Websites created in various platforms have…

Data Mining For Automated Personality Classification

Data mining, Web | Desktop Application
Data Mining For Automated Personality Classification We often come across areas where we have access to large amounts of personal behavioural data. This data can help us classify people using automated personality classification (APC). In this project, we propose an advanced APC system. We use learning algorithms along with advanced data mining to mine user characteristics data and learn from the patterns. This learning can then be used to classify or predict user personality based on past classifications. The system analyses a large set of user characteristics and behaviours and, based on the patterns observed, stores its own user characteristic patterns in a database. It then predicts a new user's personality from the personality data stored by classifying previous users' data. This system is useful to social networks as…

Real Estate Search Based On Data Mining

Data mining, Web | Desktop Application
Real Estate Search Based On Data Mining This project helps users make good decisions when buying or selling valuable property. Before this online system, the process involved considerable travelling costs and searching time. With this system the user does not have to travel much and can search online for property that matches their requirements. The system includes property details such as address, space measurement (sq ft), number of BHKs, floor, property seller's name, contact number and email ID. The user can search for property by preferred area and by number of washrooms, bedrooms, halls and kitchens. The system contains an algorithm that calculates the loan that the user can take plus the 20%-30% cash that the user…

Detecting E Banking Phishing Websites Using Associative Classification

Data mining, Web | Desktop Application
Detecting E Banking Phishing Websites Using Associative Classification Many users purchase products online and pay through e-banking. Some e-banking websites ask users to provide sensitive data such as username, password or credit card details, often for malicious reasons. This type of e-banking website is known as a phishing website. To detect and predict e-banking phishing websites, we propose an intelligent, flexible and effective system based on a classification data mining algorithm. We implemented classification algorithms and techniques to extract criteria from phishing data sets and classify a site's legitimacy. An e-banking phishing website can be detected from some important characteristics such as URL and domain identity, and security and encryption criteria in the final phishing detection rate. Once…

E Commerce Product Rating Based On Customer Review Mining

Data mining, Web | Desktop Application
E Commerce Product Rating Based On Customer Review Mining Many users purchase products through E-commerce websites. With online shopping, many E-commerce enterprises are unable to tell whether customers are satisfied with the services the firm provides. This motivates us to develop a system where customers review products and online shopping services, which in turn helps E-commerce enterprises and manufacturers gather customer opinion and improve service and merchandise by mining customer reviews. An algorithm can be used to track and manage customer reviews by mining topics and sentiment orientation from online customer reviews. In this system the user views various products and can purchase them online. The customer gives a review of the merchandise and online shopping services. Certain keywords mentioned in…

Weather Forecasting Using Data Mining

Data mining, Web | Desktop Application
Weather Forecasting Using Data Mining Weather forecasting is the application of science and technology to predict the state of the atmosphere for a given location. Ancient weather forecasting methods usually relied on observed patterns of events, also termed pattern recognition. For example, it might be observed that if the sunset was particularly red, the following day often brought fair weather. However, not all of these predictions prove reliable. This system predicts weather based on parameters such as temperature, humidity and wind. It is a web application with an effective graphical user interface. The user logs in to the system with a user ID and password. The user enters the current temperature, humidity and wind; the system takes these parameters and predicts the weather from previous data in the database. The role…
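One simple way to predict from previous data is a nearest-neighbour lookup, sketched below in Python. This is an illustrative assumption: the excerpt does not say which algorithm the project uses, and the historical records, units and value of k here are made up.

```python
from collections import Counter
from math import dist

# Hypothetical history: ((temperature C, humidity %, wind km/h), observed weather).
HISTORY = [
    ((32, 35, 8), "sunny"), ((30, 40, 12), "sunny"), ((29, 45, 10), "sunny"),
    ((22, 90, 20), "rainy"), ((21, 85, 25), "rainy"), ((23, 88, 18), "rainy"),
    ((25, 65, 15), "cloudy"), ((24, 70, 14), "cloudy"), ((26, 60, 12), "cloudy"),
]


def predict_weather(temperature, humidity, wind, history=HISTORY, k=3):
    """Return the most common label among the k closest historical records."""
    query = (temperature, humidity, wind)
    # Rank past records by Euclidean distance to the entered values.
    nearest = sorted(history, key=lambda rec: dist(rec[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


forecast = predict_weather(31, 38, 10)
```

The three inputs are compared on their raw scales here; a fuller version would probably normalize them so that humidity does not dominate the distance.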

Opinion Mining For Restaurant Reviews

Data mining, Web | Desktop Application
Opinion Mining For Restaurant Reviews Here we propose an advanced restaurant review system that detects hidden sentiments in customer feedback and rates the restaurant accordingly. The system uses opinion mining methodology to achieve the desired functionality. Opinion Mining for Restaurant Reviews is a web application that evaluates the feedback that is posted. The system takes feedback from various users and, based on their opinions, specifies whether the restaurant is good, bad, or worst. We keep a database of sentiment-based keywords along with a positivity or negativity weight for each, and user feedback is ranked based on the sentiment keywords mined from it. Once the user logs in to the system, they view the restaurant and give feedback about it. System will use database…
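The keyword-weight idea can be sketched in a few lines of Python. The keyword table, the weights and the cut-offs between good, bad and worst are all hypothetical; in the described system the keywords and weights would come from the database.

```python
import re

# Hypothetical sentiment keywords with positivity/negativity weights.
SENTIMENT_WEIGHTS = {
    "delicious": 2, "great": 2, "good": 1, "friendly": 1,
    "slow": -1, "bad": -1, "cold": -1, "terrible": -2, "rude": -2,
}


def score_feedback(text, weights=SENTIMENT_WEIGHTS):
    """Sum the weights of all sentiment keywords found in one feedback text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(weights.get(word, 0) for word in words)


def rate_restaurant(feedbacks, weights=SENTIMENT_WEIGHTS):
    """Map the average feedback score to good / bad / worst (assumed cut-offs)."""
    if not feedbacks:
        return "no feedback"
    average = sum(score_feedback(f, weights) for f in feedbacks) / len(feedbacks)
    if average > 0:
        return "good"
    if average > -2:
        return "bad"
    return "worst"


rating = rate_restaurant(["The food was delicious and the staff friendly", "Good value"])
```

Plain keyword matching like this ignores negation ("not good"), so it is only a starting point for the sentiment detection the abstract describes.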

Opinion Mining For Comment Sentiment Analysis

Data mining, Web | Desktop Application
Opinion Mining For Comment Sentiment Analysis Here we propose an advanced comment sentiment analysis system that detects hidden sentiments in comments and rates the post accordingly. The system uses opinion mining methodology to achieve the desired functionality. Opinion Mining for Comment Sentiment Analysis is a web application that evaluates the topic posted by the user. The system takes the comments of various users and, based on their opinions, specifies whether the posted topic is good, bad, or worst. We keep a database of sentiment-based keywords along with a positivity or negativity weight for each, and user comments are ranked based on the sentiment keywords mined from them. Once the user logs in to the system, the user can view his own status as well as he can…

Movie Success Prediction Using Data Mining

Data mining, Web | Desktop Application
Movie Success Prediction Using Data Mining In this system we have developed a mathematical model for predicting the success class of a movie, such as flop, hit or super hit. To do this we developed a methodology in which the historical data of each component that influences the success or failure of a movie, such as actor, actress, director and music, is given its due weightage. Then, based on multiple thresholds calculated from descriptive statistics of each component's dataset, the movie is given a flop, hit or super hit class label. Admin will add the film crew data. Admin will add movie data for a particular film crew. Admin will add new movie data along with film crew details as well as release date of the…
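A rough Python sketch of the weighting and threshold idea follows. The component weights, the 0-to-1 track-record scale and the rule for deriving thresholds (mean plus or minus half a standard deviation) are assumptions made for illustration; the excerpt does not give the actual formulas.

```python
from statistics import mean, pstdev

# Hypothetical weightage for each component; the values sum to 1.
WEIGHTS = {"actor": 0.3, "actress": 0.2, "director": 0.3, "music": 0.2}


def movie_score(components, weights=WEIGHTS):
    """Weighted sum of each component's historical success rate (0.0 to 1.0)."""
    return sum(weights[name] * components[name] for name in weights)


def thresholds(history_scores):
    """Derive two cut points from descriptive statistics of past scores (assumed rule)."""
    centre = mean(history_scores)
    spread = pstdev(history_scores)
    return centre - spread / 2, centre + spread / 2


def classify(score, low, high):
    """Label a score as flop, hit or super hit using the two thresholds."""
    if score < low:
        return "flop"
    if score < high:
        return "hit"
    return "super hit"


# Hypothetical scores of past movies and a new movie's crew track record.
history = [0.2, 0.4, 0.6, 0.8]
low, high = thresholds(history)
new_movie = {"actor": 0.9, "actress": 0.8, "director": 0.9, "music": 0.7}
label = classify(movie_score(new_movie), low, high)
```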

Monitoring Suspicious Discussions On Online Forums Using Data Mining

Data mining, Web | Desktop Application
Monitoring Suspicious Discussions On Online Forums Using Data Mining People nowadays are very fond of using the internet as a discussion medium. As internet technology has grown, it has enabled many legal and illegal activities. Much first-hand news is discussed in internet forums well before it is reported in traditional mass media. This communication channel also provides an effective channel for illegal activities such as dissemination of copyrighted movies, threatening messages and online gambling. Law enforcement agencies are looking for solutions to monitor these discussion forums for possible criminal activities and download suspected postings as evidence for investigation. We propose a system to tackle this problem effectively. In this project we have used a data mining algorithm to detect…

Web Data Mining To Detect Online Spread Of Terrorism

Data mining, Web | Desktop Application
Web Data Mining To Detect Online Spread Of Terrorism Terrorism has grown its roots quite deep in certain parts of the world. With increasing terrorist activity, it has become important to curb terrorism and stop its spread in time. The internet has been identified as a major channel for spreading terrorism through speeches and videos. Terrorist organizations use the internet to brainwash individuals and to promote terrorist activities through provocative web pages that inspire helpless people to join them. Here we propose an efficient web data mining system to detect such web properties and flag them automatically for human review. Data mining is a technique used to mine out patterns of useful data from large data sets and make the most use of obtained results. Data mining…

Opinion Mining For Social Networking Site

Data mining, Web | Desktop Application
Opinion Mining For Social Networking Site This system uses opinion mining methodology to achieve the desired functionality. Opinion Mining for Social Networking Site is a web application. A user posts their views on some subject; other users view the post and comment on it. The system takes the comments of various users and, based on their opinions, specifies whether the posted topic is good, bad, or worst. Users can change their profile picture and update their status, and these changes are visible to other users. We keep a database of sentiment-based keywords along with a positivity or negativity weight for each, and user comments are ranked based on the sentiment keywords mined from them. Once the user logs in to the system,…

Website Evaluation Using Opinion Mining

Data mining, Web | Desktop Application
Website Evaluation Using Opinion Mining Here we propose an advanced website evaluation system that rates a website based on users' opinions. The website is evaluated on factors such as its genuineness, timely delivery of the product after an online transaction, and the support it provides. Users comment about the website, and the system rates the website based on those comments. The system takes the opinions of various users and decides from them whether the website is genuine or not. The system uses opinion mining methodology to achieve the desired functionality. We keep a database of sentiment-based keywords along with a positivity or negativity weight for each, and user comments are ranked based on the sentiment keywords mined from them. The system contains…

Employee Hourly Attendance By Barcode Scan

Data mining, Web | Desktop Application
Employee Hourly Attendance By Barcode Scan The proposed project is a system that keeps track of employees’ attendance using a barcode scanner. The concept automates the traditional signature-based attendance system by using an authentication technique. The traditional system requires a register in which employees manually sign their attendance, which is time consuming. The proposed project eliminates the need to maintain an attendance sheet. The system authenticates each employee with a barcode that represents their unique ID. Every employee has an attendance card, which they scan using the barcode scanner; the system records their attendance with date and time. The system stores employees’ attendance details and generates a brief report for the admin as required. Such kind of…

Smart Health Prediction Using Data Mining

Data mining, Web | Desktop Application
Smart Health Prediction Using Data Mining It may often happen that you or someone close to you needs a doctor's help immediately, but none is available for some reason. The Health Prediction system is an end user support and online consultation project. Here we propose a system that allows users to get instant guidance on their health issues through an intelligent online health care system. The system is fed with various symptoms and the diseases or illnesses associated with those symptoms. It allows users to share their symptoms and issues, then processes the symptoms to check for illnesses that could be associated with them. We use intelligent data mining techniques to guess the most accurate illness that could be associated with the patient’s symptoms. If the…

Product Review Analysis For Genuine Rating

Data mining, Web | Desktop Application
Product Review Analysis For Genuine Rating Here we propose an advanced product review analysis system that provides a platform for registered users to rate one or more products. The system uses product review analysis to achieve the desired functionality. It is a web application containing multiple products, added by the admin, for users to rate and review. The system takes the reviews of various users and, based on their personal opinions, specifies whether the product is good, bad, or worst. We keep a database of sentiment-based keywords along with a positivity or negativity weight for each, and user reviews are ranked based on the sentiment keywords mined from them. Once the user logs in to the system he views multiple products and…

Sentiment Based Movie Rating System

Data mining, Networking, Web | Desktop Application
Sentiment Based Movie Rating System We often come across movie rating websites where users are allowed to rate and comment on movies online. These ratings are provided as input to the website's rating system. The admin then checks reviews and critics’ ratings and displays an online rating for every movie. Here we propose an online system that lets users post reviews, stores them, and automatically rates movies based on user sentiment. The system analyzes this data to check for the user sentiment associated with each comment. Our system includes a sentiment library designed for English as well as Hindi sentiment analysis. The system breaks up user comments to check for sentiment keywords and predicts the user sentiment associated with them. Once the keywords are found it associates the comment with…

Heart Disease Prediction Project

Data mining, Multimedia, Web | Desktop Application
Heart Disease Prediction Project It may often happen that you or someone close to you needs a doctor's help immediately, but none is available for some reason. The Heart Disease Prediction application is an end user support and online consultation project. Here we propose a web application that allows users to get instant guidance on heart disease through an intelligent online system. The application is fed with various details and the heart disease associated with those details. It allows users to share their heart-related issues, then processes the user-specific details to check for illnesses that could be associated with them. We use intelligent data mining techniques to guess the most accurate illness that could be associated with the patient’s details. Based on…

Topic Detection Using Keyword Clustering

Data mining, Networking, Parallel And Distributed System, Web | Desktop Application
Topic Detection Using Keyword Clustering The goal is to find the prominent topics in a collection of documents. We propose a system to detect topics from a collection of documents, using an efficient method known as a topic model. A topic model is a type of statistical model for discovering topics in a collection of documents. One would expect particular words to appear in a document more or less frequently depending on its topic: “dog” and “bone” will appear more often in documents about dogs, “cat” and “meow” will appear in documents about cats, and “the” and “is” will appear equally in both. A document typically concerns multiple topics in different proportions; thus, in a document that is 10% about cats and 90% about dogs, there would probably be about…

A New Hybrid Technique For Data Encryption

Networking, Security and Encryption, Web | Desktop Application
A New Hybrid Technique For Data Encryption Data encryption has been widely applied in many data processing areas. Various encryption algorithms have been developed for processing text documents, images, video, etc. If we can combine the advantages of different existing encryption methods, a new hybrid encryption method can be developed that offers better security and protection. To build the hybrid encryption technique, data encryption techniques using the Fibonacci series, XOR logic and PN sequences are studied and analyzed, and their performance is compared in this paper. The message is divided into three parts, these three techniques are applied to the parts, and the performance is analyzed again. The application of these three different methods to different parts of the same message along with two…
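To make two of the named building blocks concrete, here is a toy Python sketch that XORs a message with a keystream derived from the Fibonacci series. How the paper actually combines Fibonacci, XOR and PN sequences is not shown in the excerpt, so the construction below (Fibonacci numbers reduced modulo 256, used as XOR key bytes) is purely an assumption for illustration and is not a secure cipher.

```python
def fibonacci_keystream(length, a=1, b=1):
    """Yield `length` key bytes: Fibonacci numbers reduced modulo 256."""
    for _ in range(length):
        yield a % 256
        a, b = b, a + b


def xor_with_keystream(data: bytes, a=1, b=1) -> bytes:
    """XOR each byte with the keystream; applying it twice restores the input."""
    keystream = fibonacci_keystream(len(data), a, b)
    return bytes(byte ^ key for byte, key in zip(data, keystream))


# The seed pair (a, b) acts as a hypothetical shared key.
message = b"hello world"
ciphertext = xor_with_keystream(message, a=3, b=5)
recovered = xor_with_keystream(ciphertext, a=3, b=5)
```

Because XOR is its own inverse, the same function serves for both encryption and decryption as long as both sides use the same seed values.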

Detecting Fraud Apps Using Sentiment Analysis

Data mining, Networking, Web | Desktop Application
Detecting Fraud Apps Using Sentiment Analysis Most of us use Android and iOS mobiles these days and regularly use the Play Store or App Store. Both stores offer a great number of applications, but unfortunately some of those applications are fraudulent. Such applications can damage the phone and may also steal data. Hence, such applications must be marked so that store users can identify them. We propose a web application that processes the information, comments and reviews of an application, making it easier to decide whether the application is fraudulent or not. Multiple applications can be processed at a time. Also, users cannot always get correct or true reviews about a product on the internet. So rating/comments…

TV Show Popularity Analysis Using Data Mining

Data mining, Web | Desktop Application
TV Show Popularity Analysis Using Data Mining Reality TV is the new mantra of television producers and channel executives. It is the means to increase TRP ratings, and the end is always to outdo the other channels and the “similar-but-tweaked-here-and-there” shows churned out by the competition. Most of the television shows being telecast nowadays are reality shows specializing in dancing, singing and acting. We decided to build a system that recognizes the sentiment in people’s comments on TV shows. Comments from viewers are extracted along with viewer details such as gender, location, etc. The comments are gathered from various sources and entered into an Excel sheet. The Excel file will contain people's names, email IDs, age, gender, location and…

Cancer Prediction Using Data Mining

Data mining, Web | Desktop Application
Cancer Prediction Using Data Mining It may often happen that you or someone close to you needs a doctor's help immediately, but none is available for some reason. The Cancer Disease Prediction application is an end user support and online consultation project. Here we propose a web application that allows users to get instant guidance on cancer through an intelligent online system. The application is fed with various details and the cancer disease associated with those details. It allows users to share their health-related issues for cancer prediction, then processes the user-specific details to check for illnesses that could be associated with them. Here we use some intelligent data mining techniques to guess the most accurate illness that could be associated with…

Symptom Based Clinical Document Clustering by Matrix Factorization

Data mining, Web | Desktop Application
Symptom Based Clinical Document Clustering by Matrix Factorization Here we propose a doctor’s clinic management system. When a patient visits the clinic, the receptionist enters his/her details into the system if the patient is new; if the patient is already registered, the receptionist searches for the patient’s name and adds the patient to the queue. When the patient’s turn comes, the doctor can see his/her details and check details of any previous conditions. After seeing the patient, the doctor enters the medicines the patient needs to take. The receptionist receives those details, and there is a section to add symptoms. In the end we have a list of symptoms and the medication provided for them. On…

Using Data Mining To Improve Consumer Retailer Connectivity

Data mining, Web | Desktop Application
Using Data Mining To Improve Consumer Retailer Connectivity Many consumers prefer online shopping. Busy day-to-day schedules lead many consumers to shop on e-commerce websites, which saves them time and cost. With the growth of e-commerce websites, retailers tend to fail to attract more consumers. Consumers no longer feel a difference between e-shopping and offline shopping. We propose a system that connects the consumer and the retailer, creating a bridge between them. We have implemented an effective data mining algorithm to analyze new patterns and trends. The system gathers data on customer behavior patterns and supplies it to retailers, so that retailers are able to learn the new patterns and trends. With this information the retailer can approach targeted…

E Commerce Product Rating Based On Customer Review Mining

Data mining, Web | Desktop Application
E Commerce Product Rating Based On Customer Review Mining Many users purchase products through E-commerce websites. Because of online shopping, E-commerce enterprises were unable to trace customer satisfaction for the services provided by the firm. This gave rise to an idea of a system where various customers give reviews about the product and online shopping services, which in turn help the E-commerce enterprises and manufacturers to get customer opinion to improve service and merchandise through mining customer reviews. System uses an algorithm to track and manage customer reviews, through mining topics and sentiment orientation from online customer reviews. In this system user will view and purchase products online. In addition, the Customer will give a review about the merchandise and online shopping services. Certain keywords mentioned in the customer review…

Opinion Mining For Restaurant Reviews

Data mining, Web | Desktop Application
Opinion Mining For Restaurant Reviews This system rates a restaurant by detecting hidden sentiments in the feedback received from its customers. The system uses opinion mining methodology to achieve the desired functionality. Opinion Mining for Restaurant Reviews is a web application that takes feedback from various users and, based on their opinions, specifies whether the restaurant is good, bad, or worst. The system's database holds keywords marked as negative or positive, which help the system recognize and match the feedback and rank it accordingly. The role of the admin is to post new restaurants and add keywords to the database. This application is a boon for food lovers and works as a source of advertisement because it makes people aware about the services…

Smart Health Prediction Using Data Mining php

Data mining, Web | Desktop Application
Smart Health Prediction Using Data Mining The Health Prediction system is an end user support and online consultation project. This system allows users to get instant guidance on their health issues through an intelligent online health care system. The system contains data on various symptoms and the diseases or illnesses associated with those symptoms. It also lets users share their symptoms and issues. The system processes those symptoms to check for illnesses that can be associated with them. It is designed to use intelligent data mining techniques to guess the most accurate illness based on the patient’s symptoms. If the user’s symptoms do not exactly match any disease in the database, it shows the diseases the user could probably have based on his/her symptoms. It also consists…
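The partial-match behaviour can be sketched in Python as a ranking by symptom overlap. The toy knowledge base and the use of Jaccard similarity as the score are assumptions for illustration only; the excerpt does not name the technique the project uses, and the data below is not medical advice.

```python
# Hypothetical toy knowledge base: disease -> known symptoms.
DISEASE_SYMPTOMS = {
    "common cold": {"cough", "sneezing", "sore throat", "runny nose"},
    "influenza": {"fever", "cough", "body ache", "fatigue"},
    "migraine": {"headache", "nausea", "light sensitivity"},
}


def rank_diseases(symptoms, knowledge=DISEASE_SYMPTOMS):
    """Rank diseases by Jaccard similarity between reported and known symptoms."""
    reported = {s.strip().lower() for s in symptoms}
    scored = []
    for disease, known in knowledge.items():
        overlap = len(reported & known)
        if overlap:  # skip diseases that share no symptom with the report
            scored.append((disease, overlap / len(reported | known)))
    # Highest similarity first; an exact match would score 1.0.
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranking = rank_diseases(["Fever", "cough", "fatigue"])
```

When no disease scores 1.0, the ordered list plays the role of the "diseases the user could probably have" mentioned above.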
Read More
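
As a rough illustration of the symptom matching described in the entry above, the sketch below ranks diseases by how strongly their known symptoms overlap with the reported ones, so that inexact matches still produce candidates. The symptom table and the use of Jaccard overlap as the score are assumptions; the entry does not name the data mining technique the system actually uses.

```python
# Hypothetical symptom table; the described system keeps this data in its database.
DISEASE_SYMPTOMS = {
    "flu": {"fever", "cough", "body ache"},
    "common cold": {"cough", "sneezing", "runny nose"},
    "migraine": {"headache", "nausea", "light sensitivity"},
}


def rank_diseases(user_symptoms, table=DISEASE_SYMPTOMS):
    """Rank diseases by Jaccard overlap between reported and known symptoms."""
    reported = {s.strip().lower() for s in user_symptoms}
    scored = []
    for disease, symptoms in table.items():
        overlap = len(reported & symptoms)
        if overlap:  # keep only diseases sharing at least one symptom
            scored.append((overlap / len(reported | symptoms), disease))
    return [disease for _, disease in sorted(scored, reverse=True)]
```

Calling `rank_diseases(["fever", "cough"])` lists "flu" ahead of "common cold", because two of the three flu symptoms are reported while only one cold symptom is.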

Evaluation of Academic Performance of Students with Fuzzy Logic

Data mining, Web | Desktop Application
Evaluation of Academic Performance of Students with Fuzzy Logic Students’ academic success is evaluated by their performance in exams conducted by institutes or universities. This system evaluates students’ academic performance with a fuzzy-logic-based performance evaluation method. The method considers three parameters (attendance, internal marks, and external marks) to evaluate a student’s final academic performance. A fuzzy inference system is used to obtain the performance of students for different input values of attendance and marks.
Read More
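
A fuzzy inference step over the three inputs named in the entry above could look like the following sketch. It uses a zero-order Sugeno-style scheme with triangular membership functions; the fuzzy sets, the 0 to 100 scale, and the rule consequents are assumptions chosen for illustration, not the rule base of the described system.

```python
from itertools import product


def tri(x, a, b, c):
    """Triangular membership function peaking at b and zero outside (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Assumed fuzzy sets on a 0-100 scale (the outer feet sit just beyond the range).
SETS = {"low": (-1, 0, 50), "medium": (0, 50, 100), "high": (50, 100, 101)}
LEVEL = {"low": 0, "medium": 1, "high": 2}


def evaluate(attendance, internal, external):
    """Return a crisp performance score in [0, 100] from three percentage inputs."""
    inputs = [min(max(v, 0.0), 100.0) for v in (attendance, internal, external)]
    num = den = 0.0
    # One rule per combination of levels; firing strength is the min of memberships.
    for combo in product(SETS, repeat=3):
        strength = min(tri(x, *SETS[name]) for x, name in zip(inputs, combo))
        consequent = sum(LEVEL[name] for name in combo) / 6 * 100
        num += strength * consequent
        den += strength
    return num / den  # weighted average of the rule consequents
```

With these assumed sets, a student at 100 on all three inputs fires only the high/high/high rule and scores 100, while mixed inputs blend the neighbouring rules.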

Crime Rate Prediction Using K Means

Data mining, Web | Desktop Application
Crime Rate Prediction Using K Means Crime rates are increasing nowadays in many countries. In today’s world, with such high crime rates and brutal crimes happening, there must be some protection against crime. Here we introduce a system intended to help reduce the crime rate. Crime data must be fed into the system. We use a data mining algorithm to predict crime. The K-means algorithm plays an important role in analyzing and predicting crimes: it is used to cluster co-offenders, to study the collaboration and dissolution of organized crime groups, and to identify relevant crime patterns and hidden links, alongside link prediction and statistical analysis of crime data. This system is intended to help prevent crime in society. Crime data stored in the database is analyzed. The data mining algorithm will extract information and patterns from the database. System will…
Read More
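
For readers unfamiliar with the clustering step mentioned in the entry above, here is a minimal sketch of plain K-means (Lloyd's algorithm) in Python. Representing each crime record as a numeric point, and seeding the centroids with the first k points so the result is repeatable, are assumptions for illustration; the entry does not say which features or initialization the system uses.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster numeric tuples into k groups; returns (centroids, clusters)."""
    centroids = [tuple(p) for p in points[:k]]  # deterministic seed (assumption)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, clusters
```

Given hypothetical incident coordinates such as `[(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]` and `k=2`, the two returned clusters separate the incidents near (1, 1) from those near (9, 9).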

Predicting User Behavior Through Sessions Web Mining

Data mining, Web | Desktop Application
Predicting User Behavior Through Sessions Web Mining This is a method for extracting user sessions from the created session file. Depending on the sessions created, user behaviour is predicted by displaying the most visited page or product. Usability is defined as the satisfaction, efficiency and effectiveness with which specific users can complete specific tasks in a particular environment. The process includes three stages: data cleaning, user identification, and session identification. In this paper, we implement these three phases. Mining is performed based on the frequency with which users visit each page. By finding the user's session, we can analyze user behaviour from the time spent on a particular page.
Read More
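
The three stages named in the entry above (data cleaning, user identification, session identification) can be sketched as follows. The log record shape `(user_id, unix_timestamp, page)`, the list of static file extensions, and the 30-minute inactivity cutoff are assumptions for illustration and are not taken from the described project.

```python
from collections import Counter, defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds of inactivity that end a session (assumed)
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg")  # assumed noise to discard


def clean(log):
    """Data cleaning: drop hits on static resources."""
    return [hit for hit in log if not hit[2].endswith(STATIC_SUFFIXES)]


def build_sessions(log):
    """Return {user: [[page, ...], ...]} from (user, timestamp, page) records."""
    by_user = defaultdict(list)
    for user, ts, page in log:  # user identification: group hits per user
        by_user[user].append((ts, page))
    sessions = {}
    for user, hits in by_user.items():
        hits.sort()
        user_sessions, last_ts = [], None
        for ts, page in hits:  # session identification: split on long gaps
            if last_ts is None or ts - last_ts > SESSION_TIMEOUT:
                user_sessions.append([])
            user_sessions[-1].append(page)
            last_ts = ts
        sessions[user] = user_sessions
    return sessions


def most_visited(sessions):
    """Return the page that appears most often across all sessions, or None."""
    counts = Counter(p for user in sessions.values() for s in user for p in s)
    return counts.most_common(1)[0][0] if counts else None
```

Running `most_visited(build_sessions(clean(log)))` on a log yields the page that the entry's system would display as the most visited one.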

Movie Success Prediction Using Data Mining PHP

Data mining, Web | Desktop Application
Movie Success Prediction Using Data Mining PHP In this system we have developed a mathematical model for predicting the success class of a movie, such as flop, hit, or super hit. To do this, we developed a methodology in which the historical data of each component that influences the success or failure of a movie (such as actor, actress, director, and music) is given its due weightage. Then, based on multiple thresholds calculated from descriptive statistics of the dataset for each component, the movie is given a class label of flop, hit, or super hit. Admin will add the film crew data. Admin will add movies data of a particular film crew. Admin will add new movie data along with film crew details as well as release date of…
Read More
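
The weighting-and-threshold idea in the entry above can be sketched in a few lines of Python. The component weights, the cutoffs, and the assumption that each component is summarized as a historical success rate between 0 and 1 are all hypothetical; the described system derives its thresholds from descriptive statistics of its own dataset.

```python
# Hypothetical weights and cutoffs, for illustration only.
WEIGHTS = {"actor": 0.3, "actress": 0.2, "director": 0.3, "music": 0.2}
THRESHOLDS = [(0.75, "super hit"), (0.5, "hit")]  # checked from highest to lowest


def predict_success(component_scores):
    """component_scores: {component: historical success rate in [0, 1]}."""
    total = sum(WEIGHTS[c] * component_scores.get(c, 0.0) for c in WEIGHTS)
    for cutoff, label in THRESHOLDS:
        if total >= cutoff:
            return label
    return "flop"
```

A movie whose actor, actress, director, and music each have a success rate of 0.9 gets a weighted score of about 0.9 and is labelled "super hit"; missing components count as 0.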

A Commodity Search System For Online Shopping Using Web Mining

Data mining, Web | Desktop Application
A Commodity Search System For Online Shopping Using Web Mining With the popularity of the Internet and e-commerce, the number of shopping websites has rapidly increased, which enables people to shop easily online. Consumers spend a lot of time searching for commodities because they need to filter and compare the search results themselves. In recent years, a growing number of parity (price-comparison) websites have been helping consumers buy commodities more cheaply. Although these websites can help consumers find the parity price of commodities, the search results are not ideal: the commodities returned may differ from what the consumer wanted to find, or the price in the search results may differ from the price on the commodity's web page. Therefore, this study attempts to use web mining…
Read More

Real Estate Search Based On Data Mining

Data mining, Web | Desktop Application
Real Estate Search Based On Data Mining This project helps users make good decisions about buying or selling valuable property. Prior to this online system, the process involved a lot of travelling costs and searching time. With this system, users can search for the required property online and get property details according to their preferences, which saves both time and energy. The system includes property details such as address, space measurement (sq ft), number of BHKs, floor, property seller's name, and the seller's contact number and email ID. The user can search for property by the desired area and by the number of washrooms, bedrooms, halls and kitchens. The system contains an algorithm that calculates the amount of loan that the user can take along with the…
Read More

Smart Health Prediction Using Data Mining

Data mining, Web | Desktop Application
Smart Health Prediction Using Data Mining The Health Prediction system is an end user support and online consultation project. This system allows users to get instant guidance on their health issues through an intelligent health care system online. The system contains data on various symptoms and the diseases/illnesses associated with those symptoms. It also has an option for users to share their symptoms and issues. The system processes those symptoms to check for the various illnesses that can be associated with them. The system is designed to use intelligent data mining techniques to guess the most likely illness based on the patient’s symptoms. If the user’s symptoms do not exactly match any disease in the database, then it shows the diseases the user could probably have based on his/her symptoms. It also consists…
Read More

Secure Mining of Association Rules in Horizontally Distributed Databases

Data mining
Secure Mining of Association Rules in Horizontally Distributed Databases We propose a protocol for secure mining of association rules in horizontally distributed databases. The current leading protocol is that of Kantarcioglu and Clifton [18]. Our protocol, like theirs, is based on the Fast Distributed Mining (FDM) algorithm of Cheung et al. [8], which is an unsecured distributed version of the Apriori algorithm. The main ingredients in our protocol are two novel secure multi-party algorithms: one that computes the union of private subsets that each of the interacting players hold, and another that tests the inclusion of an element held by one player in a subset held by another. Our protocol offers enhanced privacy with respect to the protocol in [18]. In addition, it is simpler and is significantly more efficient in…
Read More
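
To make the setting of the entry above concrete, the sketch below mines frequent itemsets over horizontally partitioned data in the plain, unsecured way: each site counts candidate itemsets in its own transactions and only the counts are summed. This is a simplified Apriori-style illustration of the distributed setting, not the FDM algorithm in full and not the secure protocol the paper proposes; the function names and data layout are assumptions.

```python
from itertools import combinations


def local_counts(transactions, candidates):
    """Count, at one site, how many local transactions contain each candidate."""
    return {c: sum(c <= t for t in transactions) for c in candidates}


def distributed_frequent_itemsets(sites, min_support):
    """sites: list of partitions, each a list of frozensets. Returns {itemset: count}."""
    total = sum(len(site) for site in sites)
    items = sorted({i for site in sites for t in site for i in t})
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    while candidates:
        # Every site counts locally; the per-site counts are summed in the clear here.
        per_site = [local_counts(site, candidates) for site in sites]
        level = {c: sum(cnt[c] for cnt in per_site) for c in candidates}
        level = {c: n for c, n in level.items() if n / total >= min_support}
        frequent.update(level)
        # Apriori join: keep (k+1)-itemsets whose k-subsets are all frequent.
        k = len(candidates[0]) + 1
        pool = sorted({i for c in level for i in c})
        candidates = [
            frozenset(c) for c in combinations(pool, k)
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        ]
    return frequent
```

In this unsecured form every site's counts are visible to whoever sums them, which is the kind of leakage a secure protocol for this setting aims to limit.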

Policy-by-Example for Online Social Networks.

Cloud Computing, Data mining, Security and Encryption
Policy-by-Example for Online Social Networks We introduce two approaches for improving privacy policy management in online social networks. First, we introduce a mechanism using proven clustering techniques that assists users in grouping their friends for group-based policy management approaches. Second, we introduce a policy management approach that leverages a user's memory and opinion of their friends to set policies for other similar friends. We refer to this new approach as Same-As Policy Management. To demonstrate the effectiveness of our policy management improvements, we implemented a prototype Facebook application and conducted an extensive user study. Leveraging proven clustering techniques, we demonstrated a 23% reduction in friend grouping time. In addition, we demonstrated considerable reductions in policy authoring time using Same-As Policy Management over traditional group-based policy management approaches. Finally,…
Read More

Web Usage Mining Using Improved Frequent Pattern Tree Algorithms

Cloud Computing, Data mining
Web Usage Mining Using Improved Frequent Pattern Tree Algorithms Web mining can be broadly defined as the discovery and analysis of useful information from the World Wide Web. Web usage mining can be described as the discovery and analysis of user access patterns, through the mining of log files and associated data from a particular website, in order to understand and better serve the needs of Web-based applications. Web usage mining can be categorised further by the kind of usage data considered: web server data, application server data, and application-level data. This research work focuses on web usage mining, and specifically on discovering the web usage patterns of sites from server log records. Memory and time usage are compared by…
Read More

An Efficient Certificateless Encryption for Secure Data Sharing in Public Clouds (Data Mining with cloud)

Data mining
An Efficient Certificateless Encryption for Secure Data Sharing in Public Clouds (Data Mining with cloud) We propose a mediated certificateless encryption scheme without pairing operations for securely sharing sensitive information in public clouds. Mediated certificateless public key encryption (mCL-PKE) solves the key escrow problem in identity-based encryption and the certificate revocation problem in public key cryptography. However, existing mCL-PKE schemes are either inefficient because of the use of expensive pairing operations or vulnerable to partial decryption attacks. In order to address the performance and security issues, in this paper, we first propose an mCL-PKE scheme without using pairing operations. We apply our mCL-PKE scheme to construct a practical solution to the problem of sharing sensitive information in public clouds. The cloud is employed as secure storage as well as…
Read More

Infrequent Weighted Itemset Mining Using Frequent Pattern Growth

Data mining
Infrequent Weighted Itemset Mining Using Frequent Pattern Growth Frequent weighted itemsets represent correlations frequently holding in data in which items may be weighted differently. However, in some contexts, e.g., when the need is to minimize a certain cost function, discovering rare data correlations is more interesting than mining frequent ones. This paper tackles the issue of discovering rare and weighted itemsets, i.e., the infrequent weighted itemset (IWI) mining problem. Two novel quality measures are proposed to drive the IWI mining process. Furthermore, two algorithms that perform IWI and Minimal IWI mining efficiently, driven by the proposed measures, are presented. Experimental results show the efficiency and effectiveness of the proposed approach.
Read More

Reversible Data Hiding With Optimal Value Transfer

Cloud Computing, Data mining
Reversible Data Hiding With Optimal Value Transfer In reversible data hiding techniques, the values of host data are modified according to some particular rules and the original host content can be perfectly restored after extraction of the hidden data on the receiver side. In this paper, the optimal rule of value modification under a payload-distortion criterion is found by using an iterative procedure, and a practical reversible data hiding scheme is proposed. The secret data, as well as the auxiliary information used for content recovery, are carried by the differences between the original pixel values and the corresponding values estimated from the neighbours. Here, the estimation errors are modified according to the optimal value transfer rule. Also, the host image is divided into a number of pixel subsets and the auxiliary…
Read More

Public auditing cloud data storage- bilinear pairing

Cloud Computing, Data mining
Public auditing cloud data storage- bilinear pairing. Cloud data security is a concern for clients using the cloud services offered by a service provider. In this paper we analyze various mechanisms for ensuring reliable data storage using cloud services. Cloud computing mainly focuses on providing computing resources as a service rather than as a product, with utilities delivered to users over the Internet. In the cloud, applications and services move to huge centralized data centers, and the services and management of this data may not be trustworthy. In a cloud environment the computing resources are under the control of the service provider, and a third-party auditor ensures the integrity of the outsourced data. The third-party auditor may not only read but also change the data. Therefore a mechanism should be provided…
Read More

Optimization of Horizontal Aggregation in SQL by Using K-Means Clustering.

Cloud Computing, Data mining
Optimization of Horizontal Aggregation in SQL by Using K-Means Clustering. To analyze data efficiently, data mining systems widely use datasets with columns in a horizontal tabular layout. Preparing a data set is one of the more complex tasks in a data mining project, requiring many SQL queries, table joins, and column aggregations. Conventional RDBMSs usually manage tables in vertical form. Aggregated columns in a horizontal tabular layout return a set of numbers, instead of one number per row. The system uses one parent table and different child tables; operations are then performed on the data loaded from multiple tables. The PIVOT operator, offered by the RDBMS, is used to calculate aggregate operations. The PIVOT method is much faster and more scalable. Partitioning a large set of data, obtained from the result of horizontal aggregation, into…
Read More
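
The difference between vertical and horizontal aggregation described in the entry above can be shown with a small, self-contained example using Python's built-in `sqlite3` module. SQLite has no PIVOT operator, so this sketch emulates the horizontal layout with `SUM(CASE ...)` columns instead; the `sales` table and its contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, quarter TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "Q1", 10), ("A", "Q2", 20), ("B", "Q1", 5), ("B", "Q1", 7), ("B", "Q2", 1)],
)

# Vertical aggregation: one row per (store, quarter) group.
vertical = conn.execute(
    "SELECT store, quarter, SUM(amount) FROM sales "
    "GROUP BY store, quarter ORDER BY store, quarter"
).fetchall()

# Horizontal aggregation: one row per store, one aggregated column per quarter.
horizontal = conn.execute(
    """
    SELECT store,
           SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1_total,
           SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2_total
    FROM sales
    GROUP BY store
    ORDER BY store
    """
).fetchall()
```

The horizontal result has one row per store, which is the shape a clustering step such as K-means expects as input: each row is one observation and each aggregated column is one feature.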

Interpreting the Public Sentiment Variations on Twitter

Data mining
Interpreting the Public Sentiment Variations on Twitter Millions of users share their opinions on Twitter, making it a valuable platform for tracking and analyzing public sentiment. Such tracking and analysis can provide critical information for decision making in various domains. Therefore it has attracted attention in both academia and industry. Previous research mainly focused on modeling and tracking public sentiment. In this work, we move one step further to interpret sentiment variations. We observed that emerging topics (named foreground topics) within the sentiment variation periods are highly related to the genuine reasons behind the variations. Based on this observation, we propose a Latent Dirichlet Allocation (LDA) based model, Foreground and Background LDA (FB-LDA), to distill foreground topics and filter out long-standing background topics. These foreground topics can give potential interpretations of…
Read More

Product Aspect Ranking and Its Applications

Data mining
Product Aspect Ranking and Its Applications Numerous consumer reviews of products are now available on the Internet. Consumer reviews contain rich and valuable knowledge for both firms and users. However, the reviews are often disorganized, leading to difficulties in information navigation and knowledge acquisition. This article proposes a product aspect ranking framework, which automatically identifies the important aspects of products from online consumer reviews, aiming at improving the usability of the numerous reviews. The important product aspects are identified based on two observations: 1) the important aspects are usually commented on by a large number of consumers and 2) consumer opinions on the important aspects greatly influence their overall opinions on the product. In particular, given the consumer reviews of a product, we first identify product aspects by a shallow dependency parser…
Read More

Supporting Privacy Protection in Personalized Web Search

Data mining, Web | Desktop Application
Supporting Privacy Protection in Personalized Web Search Personalized web search (PWS) has demonstrated its effectiveness in improving the quality of various search services on the Internet. However, evidence shows that users’ reluctance to disclose their private information during search has become a major barrier to the wide proliferation of PWS. We study privacy protection in PWS applications that model user preferences as hierarchical user profiles. We propose a PWS framework called UPS that can adaptively generalize profiles by queries while respecting user-specified privacy requirements. Our runtime generalization aims at striking a balance between two predictive metrics that evaluate the utility of personalization and the privacy risk of exposing the generalized profile. We present two greedy algorithms, namely GreedyDP and GreedyIL, for runtime generalization. We also provide an online prediction…
Read More

Set Predicates in SQL: Enabling Set-Level Comparisons for Dynamically Formed Groups

Data mining, Web | Desktop Application
Set Predicates in SQL: Enabling Set-Level Comparisons for Dynamically Formed Groups In data warehousing and OLAP applications, scalar-level predicates in SQL become increasingly inadequate to support a class of operations that require set-level comparison semantics, i.e., comparing a group of tuples with multiple values. Currently, complex SQL queries composed of scalar-level operations are often formed to obtain even very simple set-level semantics. Such queries are not only difficult to write but also challenging for a database engine to optimize, and thus can result in costly evaluation. This paper proposes to augment SQL with set predicates, to bring out otherwise obscured set-level semantics. We studied two approaches to processing set predicates: an aggregate function-based approach and a bitmap index-based approach. Moreover, we designed a histogram-based probabilistic method of set predicate selectivity…
Read More
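
The following sketch illustrates the problem the entry above starts from: answering a set-level question ("whose set of courses contains both CS101 and CS102?") with today's scalar-level SQL. It uses Python's built-in `sqlite3` module and a made-up `enrollment` table. It shows only the conventional workaround and does not attempt to reproduce the set predicate syntax the paper proposes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (student TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)",
    [
        ("ann", "CS101"), ("ann", "CS102"), ("ann", "MA201"),
        ("bob", "CS101"),
        ("cat", "CS102"), ("cat", "CS101"),
    ],
)

# Scalar-level workaround: filter rows, group them, then count distinct matches.
rows = conn.execute(
    """
    SELECT student
    FROM enrollment
    WHERE course IN ('CS101', 'CS102')
    GROUP BY student
    HAVING COUNT(DISTINCT course) = 2
    ORDER BY student
    """
).fetchall()
students = [r[0] for r in rows]
```

Even for this simple containment test, the intent ("the group's course set contains these two values") is hidden behind a filter plus a distinct count, which is the kind of obscured set-level semantics the abstract refers to.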

Context-Based Diversification for Keyword Queries Over XML Data

Data mining, Web | Desktop Application
Context-Based Diversification for Keyword Queries Over XML Data While keyword queries empower ordinary users to search vast amounts of data, the ambiguity of keyword queries makes it difficult to answer them effectively, especially when they are short and vague. To address this challenging problem, in this paper we propose an approach that automatically diversifies XML keyword search based on its different contexts in the XML data. Given a short and vague keyword query and XML data to be searched, we first derive keyword search candidates of the query by a simple feature selection model. Then, we design an effective XML keyword search diversification model to measure the quality of each candidate. After that, two efficient algorithms are proposed to incrementally compute top-k qualified query candidates as the diversified search intentions. Two selection criteria are targeted: the k selected query candidates are most relevant to…
Read More

Customizable Point-of-Interest Queries in Road Networks

Data mining, Web | Desktop Application
Customizable Point-of-Interest Queries in Road Networks … networks within interactive applications. We show that partition-based algorithms developed for point-to-point shortest path computations can be naturally extended to handle augmented queries such as finding the closest restaurant or the best post office to stop on the way home, always ranking POIs according to a user-defined cost function. Our solution allows different trade-offs between indexing effort (time and space) and query time. Our most flexible variant allows the road network to change frequently (to account for traffic information or personalized cost functions) and the set of POIs to be specified at query time. Even in this fully dynamic scenario, our solution is fast enough for interactive applications on continental road networks.
Read More
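
As a baseline for the kind of query the entry above describes ("find the closest restaurant"), here is a minimal sketch using plain Dijkstra search that stops at the first point of interest it settles. This is a deliberately simple stand-in: the paper's contribution is a set of partition-based algorithms that are much faster on continental networks, and those are not reproduced here. The adjacency-list graph format and the node names are assumptions.

```python
import heapq


def closest_poi(graph, source, pois):
    """graph: {node: [(neighbor, cost), ...]}; pois: set of nodes.

    Returns (poi, cost) for the cheapest reachable POI, or None if none is reachable.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        if node in pois:
            return node, d  # the first POI settled is the closest one
        for nxt, weight in graph.get(node, []):
            nd = d + weight
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return None
```

Because edge costs are passed in with the graph and the POI set is passed in with the query, both can change between calls, which mirrors the fully dynamic scenario the abstract mentions, though without its speed.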

Entity Linking with a Knowledge Base: Issues, Techniques, and Solutions

Data mining, Web | Desktop Application
Entity Linking with a Knowledge Base: Issues, Techniques, and Solutions The large number of potential applications from bridging web data with knowledge bases has led to an increase in entity linking research. Entity linking is the task of linking entity mentions in text with their corresponding entities in a knowledge base. Potential applications include information extraction, information retrieval, and knowledge base population. However, this task is challenging due to name variations and entity ambiguity. In this survey, we present a thorough overview and analysis of the main approaches to entity linking, and discuss various applications, the evaluation of entity linking systems, and future directions.
Read More

Tweet Segmentation and Its Application to Named Entity Recognition

Data mining, Web | Desktop Application
Tweet Segmentation and Its Application to Named Entity Recognition Twitter has attracted millions of users to share and disseminate the most up-to-date information, resulting in large volumes of data produced every day. However, many applications in Information Retrieval (IR) and Natural Language Processing (NLP) suffer severely from the noisy and short nature of tweets. In this paper, we propose a novel framework for tweet segmentation in a batch mode, called HybridSeg. By splitting tweets into meaningful segments, the semantic or context information is well preserved and easily extracted by the downstream applications. HybridSeg finds the optimal segmentation of a tweet by maximizing the sum of the stickiness scores of its candidate segments. The stickiness score considers the probability of a segment being a phrase in English (i.e., global context) and the probability of a segment being a phrase within the batch of tweets (i.e., local…
Read More
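
The optimization stated in the entry above, choosing the segmentation that maximizes the sum of the segments' stickiness scores, can be solved with a short dynamic program. The sketch below shows only that search; the `stickiness` argument is a caller-supplied placeholder, the `max_len` limit is an assumption, and nothing here reproduces how HybridSeg actually computes stickiness from global and local context.

```python
def segment(words, stickiness, max_len=3):
    """Split words into consecutive segments maximizing the sum of stickiness scores.

    stickiness: function taking a tuple of words and returning a float.
    """
    n = len(words)
    best = [0.0] + [float("-inf")] * n  # best[i]: top score for words[:i]
    back = [0] * (n + 1)                # back[i]: start index of the last segment
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            score = best[j] + stickiness(tuple(words[j:i]))
            if score > best[i]:
                best[i], back[i] = score, j
    # Walk the back-pointers from the end to recover the segments.
    segments, i = [], n
    while i > 0:
        segments.append(words[back[i]:i])
        i = back[i]
    return segments[::-1]
```

With a toy scorer that rates `("new", "york")` at 0.9, single words at 0.1, and other multi-word spans at 0.0, the words of "i love new york" are split into `["i"]`, `["love"]`, and `["new", "york"]`.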

Co-Extracting Opinion Targets and Opinion Words from Online Reviews Based on the Word Alignment Model

Data mining, Web | Desktop Application
Co-Extracting Opinion Targets and Opinion Words from Online Reviews Based on the Word Alignment Model Mining opinion targets and opinion words from online reviews are important tasks for fine-grained opinion mining, the key component of which involves detecting opinion relations among words. To this end, this paper proposes a novel approach based on the partially supervised alignment model, which regards identifying opinion relations as an alignment process. Then, a graph-based co-ranking algorithm is exploited to estimate the confidence of each candidate. Finally, candidates with higher confidence are extracted as opinion targets or opinion words. Compared to previous methods based on the nearest-neighbor rules, our model captures opinion relations more precisely, especially for long-span relations. Compared to syntax-based methods, our word alignment model effectively alleviates the negative effects of parsing errors…
Read More

Polarity Consistency Checking for Domain Independent Sentiment Dictionaries

Data mining, Web | Desktop Application
Polarity Consistency Checking for Domain Independent Sentiment Dictionaries Polarity classification of words is important for applications such as Opinion Mining and Sentiment Analysis. A number of sentiment word/sense dictionaries have been manually or (semi)automatically constructed. We notice that these sentiment dictionaries have numerous inaccuracies. Besides obvious instances, where the same word appears with different polarities in different dictionaries, the dictionaries exhibit complex cases of polarity inconsistency, which cannot be detected by mere manual inspection. We introduce the concept of polarity consistency of words/senses in sentiment dictionaries in this paper. We show that the consistency problem is NP-complete. We reduce the polarity consistency problem to the satisfiability problem and utilize two fast SAT solvers to detect inconsistencies in a sentiment dictionary. We perform experiments on five sentiment dictionaries and WordNet to show inter- and intra-dictionary inconsistencies.
Read More

RRW—A Robust and Reversible Watermarking Technique for Relational Data

Data mining, Web | Desktop Application
RRW—A Robust and Reversible Watermarking Technique for Relational Data Advancement in information technology is playing an increasing role in the use of information systems comprising relational databases. These databases are used effectively in collaborative environments for information extraction; consequently, they are vulnerable to security threats concerning ownership rights and data tampering. Watermarking is advocated to enforce ownership rights over shared relational data and for providing a means for tackling data tampering. When ownership rights are enforced using watermarking, the underlying data undergoes certain modifications, as a result of which the data quality gets compromised. Reversible watermarking is employed to ensure data quality along with data recovery. However, such techniques are usually not robust against malicious attacks and do not provide any mechanism to selectively watermark a particular attribute by taking into account its role in knowledge discovery. Therefore, reversible watermarking is required that ensures: (i) watermark encoding and decoding by…
Read More

Product Aspect Ranking and Its Applications

Cloud Computing, Data mining, Security and Encryption, Web | Desktop Application
Product Aspect Ranking and Its Applications Numerous consumer reviews of products are now available on the Internet. Consumer reviews contain rich and valuable knowledge for both firms and users. However, the reviews are often disorganized, leading to difficulties in information navigation and knowledge acquisition. This article proposes a product aspect ranking framework, which automatically identifies the important aspects of products from online consumer reviews, aiming at improving the usability of the numerous reviews. The important product aspects are identified based on two observations: 1) the important aspects are usually commented on by a large number of consumers and 2) consumer opinions on the important aspects greatly influence their overall opinions on the product. In particular, given the consumer reviews of a product, we first identify product aspects by a shallow…
Read More

Typicality-Based Collaborative Filtering Recommendation

Cloud Computing, Data mining, Security and Encryption
Typicality-Based Collaborative Filtering Recommendation Collaborative filtering (CF) is an important and popular technology for recommender systems. However, current CF methods suffer from such problems as data sparsity, recommendation inaccuracy, and big-error in predictions. In this paper, we borrow ideas of object typicality from cognitive psychology and propose a novel typicality-based collaborative filtering recommendation method named TyCo. A distinct feature of typicality-based CF is that it finds “neighbors” of users based on user typicality degrees in user groups (instead of the co-rated items of users, or common users of items, as in traditional CF). To the best of our knowledge, there has been no prior work on investigating CF recommendation by combining object typicality. TyCo outperforms many CF recommendation methods on recommendation accuracy (in terms of MAE) with an improvement of…
Read More

Panda: Public Auditing for Shared Data with Efficient User Revocation in the Cloud

Cloud Computing, Data mining, Parallel And Distributed System, Security and Encryption, Web | Desktop Application
Panda: Public Auditing for Shared Data with Efficient User Revocation in the Cloud With data storage and sharing services in the cloud, users can easily modify and share data as a group. To ensure shared data integrity can be verified publicly, users in the group need to compute signatures on all the blocks in shared data. Different blocks in shared data are generally signed by different users due to data modifications performed by different users. For security reasons, once a user is revoked from the group, the blocks which were previously signed by this revoked user must be re-signed by an existing user. The straightforward method, which allows an existing user to download the corresponding part of shared data and re-sign it during user revocation, is inefficient due to the…
Read More