Gábor Takács

57209726779

Publications - 14

A dynamic programming approach for 4D flight route optimization

Publication Name: Proceedings 2014 IEEE International Conference on Big Data IEEE Big Data 2014

Publication Date: 2014-01-01

Volume: Unknown

Issue: Unknown

Page Range: 24-28

Description:

This paper describes our solution for the GE Flight Quest 2 (FQ2) challenge, organized by Kaggle. FQ2 aimed at optimizing flight routes so that the overall cost depending on fuel consumption and delay is as low as possible. The contestants could use several data tables as inputs, including aircraft positions and destinations, weather information and other aviation-related data. Their task was to produce a flight plan for each flight, given as a list of (latitude, longitude, altitude, airspeed) quadruplets. The cost of the flight plans was evaluated with an open source simulator. Our proposed method produces an initial solution with Dijkstra's algorithm to avoid restricted zones, and then refines it using dynamic programming and local search techniques. We can extensively utilize wind forecasts and significantly divert the planes from the great circle route if necessary. Moreover, our method tries to set the ascending and descending profiles of the flights to further decrease the cost. Our algorithm achieved second place on the public, and fifth place on the private leaderboard of the contest.

Open Access: Yes

DOI: 10.1109/BigData.2014.7004427
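
The first stage named in the abstract above, an initial route found with Dijkstra's algorithm that avoids restricted zones, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not the paper's setup: the airspace is a small 2D grid, cells marked 1 are restricted, moves are 8-connected with Euclidean step costs, and the function name `dijkstra_grid` is hypothetical. Wind, altitude, airspeed and the later dynamic programming refinement are not modelled.

```python
import heapq

def dijkstra_grid(grid, start, goal):
    """Cheapest 8-connected path on a grid; cells equal to 1 are restricted.

    Returns (path, cost), or (None, inf) if the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    diag = 2 ** 0.5
    moves = [(-1, 0, 1.0), (1, 0, 1.0), (0, -1, 1.0), (0, 1, 1.0),
             (-1, -1, diag), (-1, 1, diag), (1, -1, diag), (1, 1, diag)]
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist[cell]:
            continue  # stale queue entry, a cheaper route was already found
        r, c = cell
        for dr, dc, w in moves:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nd = d + w
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    if goal not in dist:
        return None, float("inf")
    # Walk the predecessor links back from the goal to recover the route.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[goal]

# Toy airspace: the middle column is restricted except for its bottom cell.
airspace = [[0, 1, 0],
            [0, 1, 0],
            [0, 0, 0]]
route, cost = dijkstra_grid(airspace, (0, 0), (0, 2))
```

In this toy case the route has to detour through the single open cell of the middle column, which is the kind of diversion around a restricted zone that an initial solution would provide before any refinement.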

Predicting flight arrival times with a multistage model

Publication Name: Proceedings 2014 IEEE International Conference on Big Data IEEE Big Data 2014

Publication Date: 2014-01-01

Volume: Unknown

Issue: Unknown

Page Range: 78-84

Description:

Airlines are constantly looking for ways to cut flight delays, in order to enhance service quality and reduce operational costs. The goal of the data science contest, GE Flight Quest (https://www.gequest.com/c/flight), was to make flights more efficient by improving the accuracy of arrival time estimates. The data set of the contest was 128 GB in size and contained 252 data columns arranged in 34 tables. This paper presents my solution that won third prize under team name Taki. The solution employs a 6-stage model consisting of successive ridge regressions and gradient boosting machines, built on 56 features constructed from the raw data. The hardware environment used for training and running the model was a 64 core machine with 1 terabyte of memory.

Open Access: Yes

DOI: 10.1109/BigData.2014.7004435

Reducing packaging waste by GIS applications

Publication Name: Green Design Materials and Manufacturing Processes Proceedings of the 2nd International Conference on Sustainable Intelligent Manufacturing Sim 2013

Publication Date: 2013-01-01

Volume: Unknown

Issue: Unknown

Page Range: 443-447

Description:

In the 21st century, the role of electronic gadgets has reached an unexpected limit where they can literally support every aspect of our lives. Not only have the variety of their knowledge-base and the services they provide reached surprisingly high levels which are now able to support mobility around the globe, but also these gadgets have created new possibilities to measure, collect and save data. The purpose of this article is to present current packaging methods, but also to investigate and specify the actual stress on our products so that we can quantify the real exposure we want to avoid. The importance of this investigation is that if we could find a method to measure and quantify the correct data of the actual stress, it would provide an effective base for reducing the quantity of the packaging materials used. This is especially the case for those products transported on pallets, and therefore we could significantly reduce not only the quantity of the base material used for packaging but also the amount of leftover waste resulting from packaging material globally. In addition to all the above-mentioned factors, the topic also gives an interesting perspective on how the savings on packaging materials could reduce the cost of these products and so provide companies a way to serve their clients in a more cost-effective way and an option to offer their products for a much cheaper price and hence gain a larger market share. © 2013 Taylor & Francis Group.

Open Access: Yes

DOI: 10.1201/b15002-86

Visualization of movie features in collaborative filtering

Publication Name: Somet 2013 12th IEEE International Conference on Intelligent Software Methodologies Tools and Techniques Proceedings

Publication Date: 2013-01-01

Volume: Unknown

Issue: Unknown

Page Range: 229-233

Description:

In this paper we will describe a modification of the matrix factorization (MF) algorithm which allows visualizing the user and item characteristics. When applying MF for collaborative filtering, we get a model that represents the attributes of users and items by feature vectors. Some elements of these vectors may have understandable meaning for humans but due to the lack of internal connections between the feature vectors, these are difficult to visualize. In this paper we give a detailed description of an MF method enabling better visualization of features by arranging them into a 2D map, where via the calculation of the feature values we try to position features with similar 'meaning' close to each other. To achieve this, we first define a neighborhood relation on features, then we modify the MF so that we introduce a new term in the error function which penalizes the difference between the neighbor features. We show that this modification slightly decreases the accuracy of the model but we get well visualized feature maps. On the feature maps meanings can be associated with regions, and so we can provide an interesting explanation for the user why he/she was recommended the movie. Such plausible explanations may help users better understand how the system works, which can also increase customer loyalty towards the service provider. © 2013 IEEE.

Open Access: Yes

DOI: 10.1109/SoMeT.2013.6645674
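
The abstract above describes adding a term to the MF error function that penalizes differences between neighboring features. One possible shape of such an objective is sketched below. The notation is entirely an assumption made for illustration and is not taken from the paper: r_{ui} is a known rating, p_u and q_i are user and item feature vectors, N(k) is the set of features adjacent to feature k on the 2D map, λ is an ordinary regularization weight and μ weights the neighbor penalty (shown on the item side only).

```latex
E = \sum_{(u,i) \in \mathcal{R}} \left( r_{ui} - \mathbf{p}_u^{\top} \mathbf{q}_i \right)^2
  + \lambda \left( \sum_u \lVert \mathbf{p}_u \rVert^2 + \sum_i \lVert \mathbf{q}_i \rVert^2 \right)
  + \mu \sum_i \sum_{k} \sum_{l \in N(k)} \left( q_{ik} - q_{il} \right)^2
```

With μ = 0 this reduces to plain regularized MF. Increasing μ pulls adjacent features toward similar values, which would be expected to trade some accuracy for smoother maps, in line with the trade-off the abstract reports.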

Alternating least squares for personalized ranking

Publication Name: Recsys 12 Proceedings of the 6th ACM Conference on Recommender Systems

Publication Date: 2012-10-17

Volume: Unknown

Issue: Unknown

Page Range: 83-90

Description:

Two flavors of the recommendation problem are the explicit and the implicit feedback settings. In the explicit feedback case, users rate items and the user-item preference relationship can be modelled on the basis of the ratings. In the harder but more common implicit feedback case, the system has to infer user preferences from indirect information: presence or absence of events, such as a user viewed an item. One approach for handling implicit feedback is to minimize a ranking objective function instead of the conventional prediction mean squared error. The naive minimization of a ranking objective function is typically expensive. This difficulty is usually overcome by a trade-off: sacrificing the accuracy to some extent for computational efficiency by sampling the objective function. In this paper, we present a computationally effective approach for the direct minimization of a ranking objective function, without sampling. We demonstrate by experiments on the Y!Music and Netflix data sets that the proposed method outperforms other implicit feedback recommenders in many cases in terms of the ErrorRate, ARP and Recall evaluation metrics. Copyright © 2012 by the Association for Computing Machinery, Inc. (ACM).

Open Access: Yes

DOI: 10.1145/2365952.2365972

Performance analysis of DNS64 and NAT64 solutions

Publication Name: Infocommunications Journal

Publication Date: 2012-06-01

Volume: 4

Issue: 2

Page Range: 29-36

Description:

The need for DNS64 and NAT64 solutions is introduced and their operation is presented. A test environment for the performance analysis of DNS64 and NAT64 implementations is described. The resource requirements of the implementations are measured. The performance of DNS64 and NAT64 solutions is measured under heavy load conditions to determine if they are safe to be used in a production environment, like the network of an internet service provider.

Open Access: Yes

DOI: DOI not available

Predictor set optimization for collaborative filtering

Publication Name: Proceedings of the 6th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems Technology and Applications Idaacs 2011

Publication Date: 2011-12-12

Volume: 1

Issue: Unknown

Page Range: 404-407

Description:

One of the most efficient approaches to create a recommender system is collaborative filtering (CF). CF does not require metadata about users and items, but only interactions between users and items (e.g. ratings), therefore it can be applied in many problem domains. Experience shows that for achieving high accuracy, it is worthwhile to use a blended solution, consisting of many predictors. This paper presents an algorithm for constructing a set of CF predictors so that the overall accuracy of the set is high. The algorithm was tested on the Netflix Prize dataset that contains 100 million ratings. © 2011 IEEE.

Open Access: Yes

DOI: 10.1109/IDAACS.2011.6072784

Applications of the conjugate gradient method for implicit feedback collaborative filtering

Publication Name: Recsys 11 Proceedings of the 5th ACM Conference on Recommender Systems

Publication Date: 2011-12-06

Volume: Unknown

Issue: Unknown

Page Range: 297-300

Description:

The need for solving weighted ridge regression (WRR) problems arises in a number of collaborative filtering (CF) algorithms. Often, there is not enough time to calculate the exact solution of the WRR problem, or it is not required. The conjugate gradient (CG) method is a state-of-the-art approach for the approximate solution of WRR problems. In this paper, we investigate some applications of the CG method for new and existing implicit feedback CF models. We demonstrate through experiments on the Netflix dataset that CG can be an efficient tool for training implicit feedback CF models. © 2011 ACM.

Open Access: Yes

DOI: 10.1145/2043932.2043987
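
To make the abstract above concrete, here is a minimal sketch of the conjugate gradient (CG) method applied to a weighted ridge regression problem, written in its normal-equation form (XᵀCX + λI)w = XᵀCy with C a diagonal matrix of per-example weights. The function name `weighted_ridge_cg`, its parameters and the toy data are assumptions made for illustration. This is plain textbook CG in pure Python, not the paper's implementation.

```python
def weighted_ridge_cg(X, y, c, lam, n_iter=None, tol=1e-20):
    """Approximately solve (X^T C X + lam*I) w = X^T C y by conjugate gradient.

    X: list of rows, y: targets, c: per-example weights, lam: ridge penalty.
    n_iter caps the number of CG steps; fewer steps give a cheaper,
    rougher solution. By default k steps are run (k = number of features).
    """
    k = len(X[0])
    # Build the k x k system matrix A and right-hand side b explicitly.
    A = [[sum(ci * row[i] * row[j] for row, ci in zip(X, c))
          + (lam if i == j else 0.0) for j in range(k)] for i in range(k)]
    b = [sum(ci * row[i] * yi for row, yi, ci in zip(X, y, c))
         for i in range(k)]

    def dot(u, v):
        return sum(a * b_ for a, b_ in zip(u, v))

    w = [0.0] * k
    r = b[:]           # residual b - A w for the starting point w = 0
    p = r[:]           # first search direction
    rs = dot(r, r)
    for _ in range(k if n_iter is None else n_iter):
        if rs < tol:
            break
        Ap = [dot(row, p) for row in A]
        alpha = rs / dot(p, Ap)
        w = [wi + alpha * pi for wi, pi in zip(w, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return w

# Toy problem: three examples, two features, the first example weighted double.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
c = [2.0, 1.0, 1.0]
w = weighted_ridge_cg(X, y, c, lam=1.0)
```

For this toy system A = [[4, 1], [1, 3]] and b = [5, 5], so the exact solution is w = (10/11, 15/11), which CG reaches after two steps up to rounding. Passing a smaller `n_iter` stops early and returns an approximate solution, which is the situation the abstract refers to when the exact solution is too costly or not required.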

Fast tomographic reconstruction on parallel hardware

Publication Name: Aip Conference Proceedings

Publication Date: 2010-12-01

Volume: 1281

Issue: Unknown

Page Range: 1793-1796

Description:

Tomographic reconstruction is the mathematical procedure of approximating a function f, based on the integrals of f along a set of line sections. The need for fast tomographic reconstruction arises for example in the challenging problem of real time control of some plasma parameters in a fusion reactor. In this paper, we present a fast algorithm for tomographic reconstruction. A good property of our approach is that it fits well to hardware with two levels of parallelism (e.g. a GPU cluster). We also propose an objective evaluation method for measuring the quality of reconstruction on real datasets where f is unknown. We will demonstrate that our algorithm is able to perform more than 50 000 reconstructions per second at reasonably good quality, running on relatively cheap hardware. © 2010 American Institute of Physics.

Open Access: Yes

DOI: 10.1063/1.3498232

Scalable collaborative filtering approaches for large recommender systems

Publication Name: Journal of Machine Learning Research

Publication Date: 2009-01-01

Volume: 10

Issue: Unknown

Page Range: 623-656

Description:

Collaborative filtering (CF) using known user ratings of items has proved to be effective for predicting user preferences in item selection. This thriving subfield of machine learning became popular in the late 1990s with the spread of online services that use recommender systems, such as Amazon, Yahoo! Music, and Netflix. CF approaches are usually designed to work on very large data sets. Therefore the scalability of the methods is crucial. In this work, we propose various scalable solutions that are validated against the Netflix Prize data set, currently the largest publicly available collection. First, we propose various matrix factorization (MF) based techniques. Second, a neighbor correction method for MF is outlined, which alloys the global perspective of MF and the localized property of neighbor based approaches efficiently. In the experimentation section, we first report on some implementation issues, and we suggest how parameter optimization can be performed efficiently for MFs. We then show that the proposed scalable approaches compare favorably with existing ones in terms of prediction accuracy and/or required training time. Finally, we report on some experiments performed on MovieLens and Jester data sets.

Open Access: Yes

DOI: DOI not available

A unified approach of factor models and neighbor based methods for large recommender systems

Publication Name: 1st International Conference on the Applications of Digital Information and Web Technologies Icadiwt 2008

Publication Date: 2008-12-30

Volume: Unknown

Issue: Unknown

Page Range: 186-191

Description:

Matrix factorization (MF) based approaches have proven to be efficient for rating-based recommendation systems. In this paper, we propose a hybrid approach that alloys an improved MF and the so-called NSVD1 approach, resulting in a very accurate factor model. After that, we propose a unification of factor models and neighbor based approaches, which further improves the performance. The approaches are evaluated on the Netflix Prize dataset, and they provide very low RMSE, and favorable running time. Our best solution presented here with Quiz RMSE 0.8851 outperforms all published single methods in the literature. ©2008 IEEE.

Open Access: Yes

DOI: 10.1109/ICADIWT.2008.4664342

Matrix factorization and neighbor based algorithms for the netflix prize problem

Publication Name: Recsys 08 Proceedings of the 2008 ACM Conference on Recommender Systems

Publication Date: 2008-12-01

Volume: Unknown

Issue: Unknown

Page Range: 267-274

Description:

Collaborative filtering (CF) approaches proved to be effective for recommender systems in predicting user preferences in item selection using known user ratings of items. This subfield of machine learning has gained a lot of popularity with the Netflix Prize competition started in October 2006. Two major approaches for this problem are matrix factorization (MF) and the neighbor based approach (NB). In this work, we propose various variants of MF and NB that can boost the performance of the usual ensemble based scheme. First, we investigate various regularization scenarios for MF. Second, we introduce two NB methods: one is based on correlation coefficients and the other on linear least squares. In the experimentation part, we show that the proposed approaches compare favorably with existing ones in terms of prediction accuracy and/or required training time. We present results of blending the proposed methods. © 2008 ACM.

Open Access: Yes

DOI: 10.1145/1454008.1454049
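
As background for the regularized MF models discussed in the abstract above, the following is a minimal sketch of rating-based matrix factorization trained by stochastic gradient descent with L2 regularization. It is a generic textbook formulation, not one of the paper's specific variants. The function names `train_mf` and `rmse`, the hyperparameter values and the six toy ratings are all assumptions made for illustration.

```python
import random

def train_mf(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02,
             epochs=500, seed=0):
    """Fit user factors P and item factors Q so that P[u] . Q[i] ~ rating.

    ratings: list of (user, item, rating) triples.
    lr is the learning rate and reg the L2 regularization weight.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    order = list(ratings)
    for _ in range(epochs):
        rng.shuffle(order)  # visit the ratings in a fresh random order
        for u, i, r in order:
            err = r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error plus the L2 penalty.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error of the model on the given ratings."""
    se = sum((r - sum(a * b for a, b in zip(P[u], Q[i]))) ** 2
             for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

# Toy data: 3 users, 3 items, 6 known ratings.
toy = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
P0, Q0 = train_mf(toy, 3, 3, epochs=0)   # untrained baseline
P1, Q1 = train_mf(toy, 3, 3)             # trained model
before, after = rmse(toy, P0, Q0), rmse(toy, P1, Q1)
```

The comparison of `before` and `after` only measures training error on the toy ratings. The Quiz RMSE figures quoted in this listing were obtained on held-out Netflix Prize data, which this sketch does not attempt to reproduce.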

Investigation of various matrix factorization methods for large recommender systems

Publication Name: Proceedings IEEE International Conference on Data Mining Workshops Icdm Workshops 2008

Publication Date: 2008-12-01

Volume: Unknown

Issue: Unknown

Page Range: 553-562

Description:

Matrix Factorization (MF) based approaches have proven to be efficient for rating-based recommendation systems. In this work, we propose several matrix factorization approaches with improved prediction accuracy. We introduce a novel and fast (semi)-positive MF approach that approximates the features by using positive values for either users or items. We describe a momentum-based MF approach. A transductive version of MF is also introduced, which uses information from test instances (namely the ratings users have given for certain items) to improve prediction accuracy. We describe an incremental variant of MF that efficiently handles new users/ratings, which is crucial in a real-life recommender system. A hybrid MF-neighbor-based method is also discussed that further improves the performance of MF. The proposed methods are evaluated on the Netflix Prize dataset, and we show that they can achieve very favorable Quiz RMSE (best single method: 0.8904, combination: 0.8841) and running time. © 2008 IEEE.

Open Access: Yes

DOI: 10.1109/ICDMW.2008.86

Investigation of various matrix factorization methods for large recommender systems

Publication Name: Proceedings of the 2nd Kdd Workshop on Large Scale Recommender Systems and the Netflix Prize Competition Netflix 08

Publication Date: 2008-12-01

Volume: Unknown

Issue: Unknown

Page Range: Unknown

Description:

Matrix Factorization (MF) based approaches have proven to be efficient for rating-based recommendation systems. In this work, we propose several matrix factorization approaches with improved prediction accuracy. We introduce a novel and fast (semi)-positive MF approach that approximates the features by using positive values for either users or items. We describe a momentum-based MF approach. A transductive version of MF is also introduced, which uses information from test instances (namely the ratings users have given for certain items) to improve prediction accuracy. We describe an incremental variant of MF that efficiently handles new users/ratings, which is crucial in a real-life recommender system. A hybrid MF - neighbor-based method is also discussed that further improves the performance of MF. The proposed methods are evaluated on the Netflix Prize dataset, and we show that they can achieve very favorable Quiz RMSE (best single method: 0.8904, combination: 0.8841) and running time. Copyright 2008 ACM.

Open Access: Yes

DOI: 10.1145/1722149.1722155