11-12 October 2016, Holiday Inn, Belgrade


About the Conference


Data Science Conference is the first open conference dedicated to Data Science in the Balkans. When we started, we wanted to make an impact and bring crucial changes to the Data Science scene in Serbia. We wanted to create a high-level event, and our role models were conferences such as PyData and Strata + Hadoop. Today, the goals of the Conference are to promote Data Science in the Balkans, to foster networking in the community and to help the Data Science ecosystem develop.

"We chose Data Science because it is everywhere around us, affecting our lives on a daily basis." The field of Data Science is evolving into one of the fastest-growing and most in-demand fields in the world. Organizations across industries are looking to make sense of the data they can now collect from new technologies, from predicting the next hot product to determining the risk of an infectious disease outbreak.

The team behind the Conference is the Institute of Contemporary Sciences. You can find out more about our work at www.isn.rs.

This year we divided our speakers into two parallel tracks: Big Data and Machine Learning. The topics that will be covered this year are:

Machine Learning - including, but not limited to: Natural Language Processing, Deep Learning, Neural Networks, Bayesian Non-Parametrics, Topic Models, Probabilistic Programming, Machine Learning Tools in practice, et cetera;

Big Data - including, but not limited to: Big Data Models and Algorithms, Big Data Management, Quality of Big Data Services, Big Data Search, Mining, and Visualization, Big Data Applications, Real-Time Big Data Analytics, Big Data Tools in practice, et cetera.

Speakers for the Conference were selected through a public Call for Papers and in partnership with our sponsors. You can find the speakers we selected through the Call for Papers in the next section. The schedule of the Conference will be announced on 10 September.

Last year we organized the first open conference dedicated to Data Science in the Balkans. Some of the Conference numbers were:

  • 500 applications

  • 300 attendees

  • 2 working days, with 2 parallel tracks

  • 24 speakers, 22 talks/workshops

  • 580 likes on the Facebook page, average post reach of ~5000 people

  • 20 business partners

For more information about the previous Conference, you can visit last year's website.










Meet our speakers:

Vladimir Markovic


BI Team Leader at Banca Intesa Beograd

Marko Smiljanic


CEO at Institute NIRI Ltd.

Zana Pekmez


CFO & COO at Privredna banka Sarajevo

Nenad Bozic


Co-Founder at SmartCat.io

Marko Vasiljevski


Data Science Lead at Flight Data Services

Milan Vukicevic


Assistant Professor at Faculty of Organizational Sciences

Mladen Jovanovski


Information Management Specialist at IBM

Branimir Todorovic


Machine learning consultant at Institute NIRI Ltd.

Darko Marjanovic


CEO at Things Solver

Petar Zecevic


CTO at SV Group

Marko Krstic


Advisor at RATEL

Ognjen Zelenbabic


Head of Content Insights Labs

Nikola Krgovic


Senior System Administrator at Limundo

Milos Milovanovic


Data Engineer at Things Solver

Jelena Lukić


BI team leader at Parallel d.o.o.

Marko Mitic


Business Data Analyst at Telenor Srbija

Jelena Milovanovic


Research Assistant at NIRI Intelligent Computing

Djordje Nedeljkovic


Teaching assistant at Faculty of Civil Engineering, Belgrade

Bojan Sovilj


1st Walker at Cloudwalker

Srdjan Mladjenovic


Analytics/Big Data Business Development Manager at Comtrade System Integration

Apply for Data Science Conference 2.0 now!

Don't delay any longer: apply right now for Data Science Conference 2.0! Our Conference is open, which means there is no participation fee.
Our motto is: Totally open, totally free.
The application form is open until 1 October 2016! Important notice: the number of seats at the Conference is limited.

Conference Schedule

with auditoriums

Apache Spark, and its components Spark ML and MLlib in particular, offer a range of possibilities for machine learning on large data sets. Spark contains powerful algorithms for supervised and unsupervised learning, regression, classification and clustering, as well as methods for data transformation and preparation. I will give an overview of these algorithms and methods and examples of Apache Spark programs using them.

Petar Zečević bio:

Has been working as a Java developer, software architect and IBM software consultant for 15 years. Author of "Spark in Action". Organizer of Apache Spark Zagreb Meetups.


Marko Vasiljevski bio:

Marko is a PhD candidate in physics. For his MSc he developed a platform for Monte Carlo numerical simulations in nanocomposite materials. He has a strong background in statistics, digital signal processing and programming, coupled with years of experience in the business, medical and academic sectors. He likes to combine different skills and tools to find patterns in data crucial for flight safety, as well as to develop machine learning algorithms.


Bojan Sovilj bio:

Bojan is one of the pioneers of the Data Science scene in Serbia. He graduated from the Faculty of Mathematics at the University of Belgrade. His vast experience comes from more than 12 years of work at Mozzart Bet, where he worked on big and complex systems. Currently, he works at his own company, Cloudwalker, in the position of 1st Walker.

As a huge number of traditional TV programs and on-demand video streams offered via the Internet are now simultaneously available through hybrid broadcast broadband television, the search for interesting content often turns into a time-consuming task for the viewer. In a situation like this, both the providers and the viewers would benefit from personalized recommender systems. The choice of neural network architecture and learning algorithm is mainly influenced by users’ privacy concerns and the characteristics of the data collected from user interactions. This session will discuss how to overcome these challenges by using a feedforward neural network trained by a cost-sensitive version of the Extreme Learning Machine (ELM) algorithm and a sparse ELM autoencoder trained with the fast iterative shrinkage-thresholding algorithm, considering cases with and without “dislike” interactions, respectively. A series of tests will show that the proposed solutions improve system performance and consequently increase users’ satisfaction.

Marko Krstić bio:

Marko received both his bachelor's and master's degrees in Electrical and Computer Engineering from the School of Electrical Engineering (ETF), University of Belgrade. He is currently a Ph.D. candidate at the same faculty. Although his interests during his bachelor's studies were closely related to telecommunication networks, his research during his master's and Ph.D. studies has mainly focused on recommender systems and machine learning techniques. Three years ago he started working at the Regulatory Agency for Electronic Communications and Postal Services (RATEL) as an Advisor in the IT department. He holds many certificates, among which he would highlight the Data Science Associate certificate provided by EMC.

Neural networks are systems modeled on the human brain, consisting of a number of neurons and the connections between them. The network's weights are what make memory possible, i.e. the acquisition of knowledge, and they are modified through an iterative learning process. During learning, weight modifications are made by a learning algorithm, of which back-propagation (gradient descent) is the best known. However, the final result of back-propagation training depends significantly on the initial weight values. A genetic algorithm is a stochastic search tool based on evolutionary principles, which can be used as a learning algorithm without limitations. The scope of genetically trained networks is examined through the problem of credit risk assessment in banking, a research area known as credit scoring. Compared to back-propagation algorithms, experimental results on well-known benchmark problems in this area (Australian and German credit data) show certain advantages of genetically trained networks.

Srđan Mlađenović bio:

Srđan is a Business Development Manager at Comtrade System Integration, in charge of developing data analytics solutions. His areas of expertise are information management, business intelligence (BI) and, specifically, predictive analytics, where he has a strong academic background in machine learning. Among the more than fifty projects in which he has been involved, the most notable is the data warehouse and business intelligence project at Telekom Serbia, one of the most data-intensive projects in terms of data volume and the complexity of the transformations required (ETL). His research interests are focused on predictive classification problems applied to churn prediction, credit scoring and other related domains.


Vladimir Marković bio:

Vladimir has comprehensive experience in DW/BI design and development, primarily based on Microsoft and SAS BI platforms and products. This experience is further bolstered by years of implementing and developing different kinds of DW/BI solutions and products. During his work at the bank, he has gained a broad business background in different fields of BI application, especially analytical customer intelligence, credit risk scoring, credit risk portfolio management and accounting. He is an experienced trainer and presenter. He enjoys sharing his enthusiasm by presenting and promoting DW/BI at courses, user groups, technical events and conferences. Vladimir holds an MSc in Math and Computer Science from the Faculty of Mathematics, University of Belgrade. His areas of interest are dimensional modeling and data mining.

Recurrent Neural Networks (RNNs) form a wide class of neural networks in which feedback connections between processing units are allowed. Applications of RNNs range from industrial process identification, modelling and adaptive control to financial time series prediction and classification, audio and video signal processing and sequence labeling in natural language processing. Echo state recurrent neural networks (ESNs) are arguably one of the most interesting recently proposed learning models in this field, since they have been considered as a possible learning model in biological brains. In this presentation we first establish the connection of ESNs with some previously known recurrent network architectures and then propose a set of online training algorithms, derived from recursive Bayesian joint estimation of RNN states and parameters.

Branimir Todorović bio:

Branimir Todorović is an associate professor at the Computer Science Department, Faculty of Mathematics and Sciences, University of Nis, and Lead Scientist at NIRI, Nis. He received his Doctor of Science degree from the Faculty of Electrical Engineering, University of Belgrade. His research interests include sequential Bayesian training of feedforward and recurrent neural networks, blind source separation and deconvolution, online training of structural classifiers, active and semi-supervised learning algorithms and natural language processing.

We propose a method for information extraction based on distributional vector space embedding of words and phrases. Embedding means that words and phrases are represented as dense real-valued vectors, and it is designed to satisfy the distributional hypothesis: words and phrases that occur in similar contexts tend to have similar meanings, and therefore they should have vectors that are close to each other in the vector space. We extracted phrases using Pointwise Mutual Information and then learned word and phrase vectors, using as a training corpus a set of business articles, job vacancies and employee resumes. This produced embeddings of words and phrases in a vector space where distance measures the difference between them. In the next step of the information extraction procedure, we applied hierarchical agglomerative clustering of the vectors in order to detect clusters of similar entities. Using known entities from the business domain as seeds, we were able to extract the clusters that contain them. Inside such clusters of entities (words and phrases), we found examples that were semantically similar to the given seeds but were not given as a component of the original query. In this way we were able to extract new entities.
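To illustrate the phrase-extraction step, here is a minimal sketch of Pointwise Mutual Information scoring for adjacent word pairs. The abstract does not give implementation details, so the restriction to two-word phrases, the `bigram_pmi` function, the `min_count` threshold and the toy corpus are all assumptions made for this example and not the authors' actual setup.

```python
import math
from collections import Counter


def bigram_pmi(sentences, min_count=2):
    """Score adjacent word pairs by pointwise mutual information.

    PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) )
    Pairs seen fewer than `min_count` times are skipped, because PMI
    is unreliable for very rare pairs. (Hypothetical helper.)
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (x, y), count in bigrams.items():
        if count < min_count:
            continue
        p_xy = count / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy corpus (made up for illustration only).
corpus = [
    "we need a data scientist with machine learning experience".split(),
    "the data scientist built a machine learning model".split(),
    "a senior data scientist joined the machine learning team".split(),
    "the team built a model with the data".split(),
]

scores = bigram_pmi(corpus)
ranked = sorted(scores, key=scores.get, reverse=True)
for pair in ranked:
    print(pair, round(scores[pair], 2))
```

Pairs with a high score co-occur more often than their individual frequencies would suggest, which makes them candidates for being treated as a single phrase before the vectors are learned.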

Jelena Milovanović bio:

Final-year student of the MSc course at the Department of Computer Science, Faculty of Science and Mathematics, Nis. She passed all her exams with top marks and is currently finishing her MSc thesis in the field of machine learning. She is employed at the Research and Development Institute NIRI as a Research Assistant. Recipient of the annual City of Nis award for the best student of the faculty in 2016.

Web log analysis is a standard procedure on most sites. As the number of visits grows, it becomes one of the first practical applications of Big Data systems. The goal of the presentation is to demonstrate, with an example, how to build a system for analysing web logs. As the basic platform I suggest Cloudera CDH, as the data collection tool StreamSets, and for storage I suggest keeping the data in parallel in two formats: tab-separated files for analysis, and ElasticSearch with a Kibana front-end for quick insights and dashboards.

Nikola Krgović bio:

Nikola is a Senior System Administrator with over 15 years of experience. He is passionate about server architectures, Unix systems and data storage.

In this talk you will get an overview of Apache Spark, a distributed computational framework, and see why it is a logical successor to Hadoop's MapReduce. Petar will describe the main Spark components (Core, SQL, Streaming, GraphX and ML) and will show how to use them through short code examples.

Petar Zečević bio:

Has been working as a Java developer, software architect and IBM software consultant for 15 years. Author of "Spark in Action". Organizer of Apache Spark Zagreb Meetups.


Darko Marjanović bio:

Darko is co-founder and director of Things Solver, a company whose main focus is Big Data technologies. He is one of the founders of the Data Science community in Serbia. He mainly works on the architecture of Big Data applications and on data collection and preparation. His focus is on Hadoop and Spark.

Having domain knowledge and learning and improving your technical skills is the way to go, and it is what is expected of any of us who want to be considered professionals. Ognjen will show you another dimension of business that will help you excel at what you do. He will tell you a story of how you can visualize and communicate your solutions.

Ognjen Zelenbabić bio:

Storyteller - Unveiling and communicating the secrets the data is hiding

Thanks to modern Big Data technologies, there is a lot of innovation in the banking industry. With the massive emergence of the fintech industry, users of financial services are becoming "bank-agnostic", and the key question for bank management centers on customer acquisition and retention. Modern banking needs to switch its focus to clients and their needs, as the relationship between customers and the bank will definitely take on new forms. Bearing in mind that banks are faced with a large volume, variety and velocity of data about their customers, the aim of this presentation is to show how data streaming and real-time analytics impact banking services.

Zana Pekmez bio:

Zana Pekmez graduated in economics in the US with high academic recognition. She received her master’s degree in 2010 at the Faculty of Economics in Ljubljana. She worked in the banking industry for over 10 years, gaining comprehensive managerial experience. As a former Chief Operations Development Coordinator/Deputy Head of Operations at Raiffeisen Bank, she endorsed change and challenges and demonstrated a strong commitment to a budget-driven strategy for improving business processes. Zana Pekmez is currently a Board member of Privredna Banka dd Sarajevo. Her current academic interests are business analytics, descriptive and predictive models in IS strategic planning and management (Enterprise Information Management (EIM)), and energy economics.

Jelena Lukić bio:

Jelena Lukić started her professional career in 2011 as an Oracle Business Intelligence Consultant. Since 2015 she has led the BI team at Parallel d.o.o. She received her bachelor's and master’s degrees from the Faculty of Economics, University of Belgrade, where she is currently a PhD student. She is also a member of the e-development association and a co-author of the blog Data Science Serbia.

A department, somewhere in the EU, depends on having a steady input of 3000 new textual documents per day, 365 days a year. Documents come from 10 different sources, and each document comes pre-classified into a single category of a large taxonomy. The department is unhappy: the accuracy of incoming document classifications seems to be low. Even after the department puts in an additional 800% FTE throughout the year to manually repair or discard wrongly classified documents, the accuracy still lags behind their targets. NIRI was hired to conduct research and develop an accurate document classifier. The plan was to use NIRI’s classifier to replace the unreliable classes coming with the documents, and thus solve the problem of low accuracy as well as reduce the high cost of 800% FTE. In this talk we will share our experiences: the classification approach used to meet the needs of our client, the challenges in demonstrating progress during the project, and the approach used for the acceptance validation of our classifier.

Marko Smiljanić bio:


The main idea of a Data Lake is to expose company data in an agile and flexible way to the people within the company, while preserving the safeguards and auditing features that are required for the company’s critical data. Most projects in this direction start out by depositing all of the data in Hadoop, trying to infer the schema on top of the data and then using the data for analytics purposes via Hive or Spark. The described stack is a really good approach for many use cases, as it provides cheap storage of data in files and rich analytics on top. But many pitfalls and problems might show up along this road, which can easily be addressed by extending the toolset. The potential bottlenecks will appear as soon as users arrive and start exploiting the Lake. For all of these reasons, planning and building a Data Lake within an organization requires a strategic approach, in order to build an architecture that can support it.

Miloš Milovanović bio:

Miloš is co-founder and data engineer at Things Solver. He is also one of the founders of the Data Science community in Serbia. His prime focus is on analytics on large amounts of data and on data visualisation.

Back in the day, you had a single machine and could scroll through a single log file to figure out what was going on. In this Big Data world you need to combine a lot of logs to figure out what is going on. Data arrives in huge volumes and at high speed, so picking out the important information and getting rid of noise becomes a real challenge. There is a need for a centralized monitoring platform that will aid the engineers operating the systems and serve the right information at the right time. This talk will try to help you understand all the challenges and give you an idea of which tools and technology stacks are a good fit for successfully monitoring Big Data systems. The focus will be on open source and free solutions. The problem can be separated into two domains, both of which are the subject of this talk: a metrics stack to gather simple metrics in a central place, and a log stack to aggregate logs from different machines in a central place.

Nenad Božić bio:

A craftsman with rich software engineering experience and an all-rounder who feels right at home when doing backend coding. Striving for knowledge is his main drive, which is why he enjoys learning about new tools and languages, blogging, working on open source and presenting. His current focus is Big Data and the surrounding ecosystem of tools, which is why he co-founded the SmartCat company. A strong believer in a balance between good technical skills and soft skills. A family guy who tries to spend most of his free time with his wife and daughter.


Mladen Jovanovski bio:


Success stories of the Big Data paradigm and Predictive Analytics in many application areas have led to wide recognition of their high potential impact in areas such as healthcare, marketing and finance. However, there is still a large gap between actual and potential data usage, because of numerous challenges: high dimensionality, sparsity, data heterogeneity, privacy concerns, the need for collaboration between domain experts and data scientists, the demand for highly accurate and interpretable models, etc. On the other hand, extensive scientific research efforts offer many partial or complete solutions to these challenges. Coordination of research and industry efforts (the fusion of cutting-edge predictive analytics methodologies with commercial or non-commercial products) should lead to better exploitation of the promise of Big Data, better satisfaction of industry needs and new methodological breakthroughs.

Milan Vukićević bio:

Milan Vukićević is an Associate Professor at the University of Belgrade, Faculty of Organizational Sciences. He is also CEO and one of the founders of Big Data Analytics company, for consulting, research and development in the area of Predictive Analytics. He worked as a Visiting Researcher at the Data Analysis and Biomedical Analytics (DABI) Center at Temple University (2014-2015). His work has been published at multiple conferences and in journals and book chapters. Milan has also given several invited talks on bioinformatics and healthcare predictive analytics topics.


The team behind the Conference:

Dušan Dačić

Project manager

Aleksandar Linc

Program manager

Miloš Đurić

Business Dev manager

Žarko Stamenić

UI/UX manager

Milena Ivanović

PR manager

We are proud to present the companies that supported us this year: