Have you ever predicted the score of an IPL match in the middle of an innings?

Prateek Mishra
16 min read · Oct 9, 2020

All of us know that India is a secular country where people of many religions live together; as per the Government's rules and regulations, there is no difference between the religions in our country. Six major religions are practised here: Hinduism, Islam, Buddhism, Christianity, Sikhism and Jainism. Each community celebrates its own festivals during the year, and people also participate in each other's festivals to share their happiness and show their unity. But India also has a special kind of festival, one that all the religions celebrate together on a single platform. Yes, it is absolutely right.

IPL 2020

So we are talking about a festival known as “India Ka Tyauhaar”, i.e. the IPL (Indian Premier League). It is a cricket league that started in April 2008, and ever since then it has been celebrated like a festival in our country. Every year we Indians wait for this tournament because it is connected with our emotions. Eight teams play against each other to win it: Chennai Super Kings, Royal Challengers Bangalore, Kolkata Knight Riders, Mumbai Indians, Delhi Capitals, Sunrisers Hyderabad, Kings XI Punjab and Rajasthan Royals, belonging to every region of our country, south, north, central and the rest of India. Players from all across the world participate and play for their franchises, each team having its own owners, and some of our favourite players are among them. We enjoy this tournament so much that it is treated and celebrated as a festival, though we enjoy it as fans and as an audience, not as a religious festival. That is why it is known as “India Ka Tyauhaar”. The IPL is also known as the Incredible Premier League, because no one can predict what will happen in this tournament.

But we came up with a solution for your excitement: a way to predict what will happen in the next moment of the tournament. After a lot of research and data collection, we built a solution that predicts the first-innings score after the first 5 overs of the match; this gives the shape of the first innings and an idea of where it will end up by its last moments.

So now the question is: what did we do to predict the score of an IPL match? We created a machine learning model and trained it so that, when we give it the ongoing score of an innings as input, it predicts the final first-innings score. This article is entirely based on that project.

The technologies we used in this project are the following:

  1. Python
  2. Machine Learning
  3. AWS Cloud
  4. Ansible Automation Tool
  5. DevOps ( Git , GitHub , Jenkins , Docker )
  6. Web Development using Flask

Another question people may ask is why we did this project and what its purpose is. Whenever we learn a technology, the best way to analyse and test our knowledge is to build something, to make a project with that technology.

To design any machine learning model we first have to find a good dataset; without genuine data we cannot expect a good output with low loss, or a model with the best accuracy. We searched through many datasets and finally found a perfect one about the history of all IPL matches, containing ball-by-ball records of every match up to IPL 2017. We got this data from Kaggle, one of the best platforms in the world of data science.

If you have any basic knowledge of the programming or machine learning world, you know that to run any program we need an operating system, whether the program is an application or a simple script in any language. So we also need an operating system to run our machine learning code. We can run this project on our local system as well as on top of a cloud. Nowadays we are all moving towards cloud computing: if we travel somewhere without our local system and have to showcase the project to a client or anyone else, we would be stuck at that moment, unless the project is already running on the cloud. The cloud gives us a great solution for exactly this.

In our project we used AWS Cloud to deploy both the machine learning model and the web application. Now the question is: why AWS, and how do we deploy our project on top of it? We use AWS to provision an instance, and on top of that instance we deploy our machine learning model and web application. An instance can be launched on AWS in several ways:

→ Through AWS GUI.

→ Through AWS command line.

→ Through Terraform Code.

→ Through Ansible-Playbook.

For our project we used the fourth method, an Ansible playbook, to provision the instance. We provide the AWS credentials to the playbook and then write the tasks that provision the instance. There are two ways to write Ansible code: a simple playbook, or Roles. We used Roles so that we can share the project with anyone and it will be easy for them to use. Creating a role generates the multiple folders a project requires. The Ansible command to create a role is:

ansible-galaxy init ec2prov

As you can see, six directories are created inside the role: defaults, handlers, meta, tasks, tests and vars. Each folder has a specific job in a role; for example, the tasks folder holds the playbooks for the tasks, and the vars folder stores the variables. Our first task is to provision the AWS instance. We store the access key and secret key, i.e. the credentials for our AWS account, inside the vars folder, and then write the playbooks for all the tasks inside the tasks folder.

So we wrote, inside the role, all the tasks required to provision an instance on AWS. Some variables in the code are secrets for our use case, so we reference them through two keywords, access_key and secret_key, and store their values under the same names in the role's vars folder. Now that the role is written, how do we run it? We write a main playbook that runs the role and deploys the entire AWS setup; in that playbook we just mention the role name and the host on which to run it.
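As a minimal sketch of what such a role might look like (the region, AMI ID and key-pair name below are placeholders, not our actual values), using Ansible's classic ec2 module:

```yaml
# ec2prov/vars/main.yml - secrets live here (use ansible-vault in practice)
access_key: "YOUR_ACCESS_KEY"
secret_key: "YOUR_SECRET_KEY"

# ec2prov/tasks/main.yml - provision one EC2 instance
- name: Launch an EC2 instance
  ec2:
    aws_access_key: "{{ access_key }}"
    aws_secret_key: "{{ secret_key }}"
    region: ap-south-1               # placeholder region
    image: ami-0123456789abcdef0     # placeholder AMI ID
    instance_type: t2.micro
    key_name: mykey                  # placeholder key pair
    count: 1
    wait: yes

# ec2prov.yml - main playbook that runs the role
- hosts: localhost
  roles:
    - ec2prov
```

In a real setup the vars file would be encrypted with ansible-vault rather than stored in plain text.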

To run this playbook, we use the following command on the OS where our Ansible automation tool is running:

ansible-playbook ec2prov.yml

As you can see, our Ansible playbook ran successfully, which means our instance has been launched on AWS.

We have launched an instance, but what do we do with it? First we configure the Docker tool inside the instance, because all the practicals will be done with its help. Docker can be configured in two ways, manually or with Ansible; we chose the Ansible automation method and again created a role for it with the command:

ansible-galaxy init config

This creates the same six directories as before. We wrote a playbook for all the tasks and divided them into two parts: the first part installs the Docker tool, and the second part pulls our Docker image from Docker Hub and launches a container. Let us discuss the first part.

To install Docker we first have to create a yum repository for it, then install Docker through that repository, and finally start the Docker service so we can use it.
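A hedged sketch of those install tasks (the repository URL shown is the usual Docker CE baseurl; your distribution may need a different one):

```yaml
# tasks: configure the repo, install Docker, start the service
- name: Add Docker yum repository
  yum_repository:
    name: docker-ce
    description: Docker CE repository
    baseurl: https://download.docker.com/linux/centos/7/x86_64/stable/
    gpgcheck: no

- name: Install Docker
  package:
    name: docker-ce
    state: present

- name: Start and enable the Docker service
  service:
    name: docker
    state: started
    enabled: yes
```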

Our next task is to pull a Docker image from Docker Hub and launch a container from it. We created our own image with a Dockerfile as per our project requirements and uploaded it to a Docker Hub repository. We then wrote tasks to pull that image and launch a container, as below.
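Those tasks might look like the following sketch (the image name is ours; the container name and port mapping are illustrative assumptions):

```yaml
- name: Pull the project image from Docker Hub
  docker_image:
    name: techcrew/sklearn
    tag: latest
    source: pull

- name: Launch the container
  docker_container:
    name: mlops                # illustrative container name
    image: techcrew/sklearn:latest
    state: started
    ports:
      - "1234:1234"            # illustrative mapping for the Flask app port
```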

We have written a main ansible playbook to run the role for configuring the Docker tool.

To run this playbook we again use a command on the OS where Ansible is running:

ansible-playbook playbook.yml

Next we wrote our own Dockerfile according to the project requirements. Our image comes pre-configured with Python 3, Jenkins, Java JDK, net-tools and the important Python libraries (Seaborn, Matplotlib, scikit-learn, pandas, NumPy, Flask, datetime, etc.). We built this image because we will train our machine learning model inside the container and host our web application there using Flask.
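A hypothetical sketch of such a Dockerfile (the base image, package names and entry script are assumptions, not our exact file):

```dockerfile
# Sketch only: base image, packages and CMD are assumptions
FROM centos:7

RUN yum install -y python3 java-11-openjdk net-tools
RUN pip3 install numpy pandas scikit-learn matplotlib seaborn flask
# Jenkins would also be installed here in the real image

COPY . /root/mlops/
WORKDIR /root/mlops

CMD ["python3", "app.py"]     # hypothetical entry script
```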

After writing the Dockerfile, we have to build it so that we can use it for our use case. The command to build a Docker image is:

docker build -t sklearn:latest .

Here sklearn is the image name, latest is the version tag, and the final . is the path to the Dockerfile.

After building the image we log in with our Docker Hub account so that we can push (upload) it to our repository. The commands we used are:

docker login
docker push techcrew/sklearn:latest

This is all about configuring Docker, launching the container and building our own Docker image.

We launched a Docker container for our project and then used Jenkins (a CI/CD tool) to train and build the machine learning model and to deploy the web application built with Flask.

We used the Jenkinsfile approach, writing a complete CI/CD pipeline in the Groovy DSL; all the jobs were configured with the help of the DSL plugin. The jobs required for the project are mentioned below:

Jenkins Job 1 :- First we pull the code from GitHub and store it in the /root/mlops/ directory.

We got the proper output when we ran Job 1.

Jenkins Job 2 :- This job runs the machine learning code to create the machine learning model.

We trained our machine learning model with the help of Jenkins; now let us discuss the model itself and what it actually does for us. Before the training part, we should discuss the main thing required to train any model: the dataset, and which of its features will affect the model's accuracy when predicting values. You can download the dataset from the link below.

https://raw.githubusercontent.com/techcrew5/mlops-project/master/ipl.csv

Dataset

We have thousands of observations in our dataset: 15 columns and 76,014 rows, which means 76,014 observations for our machine learning. We used multiple Python libraries for model training: pandas, NumPy, Seaborn, scikit-learn, Matplotlib and datetime. Each library has its own special functionality and use as per the requirements.

Now let us discuss the machine learning code: how we wrote it, trained the model and saved it, step by step.

→ Step 1: To load and read the dataset

→ Step 2: Removing the unwanted features ( data cleaning )

→ Step 3: Feature encoding ( converting the categorical features into numerical values with the help of one-hot encoding )

→ Step 4: Model training using the Lasso Regression Algorithm.

Now you may ask: why did we use the Lasso Regression algorithm?

LASSO (Least Absolute Shrinkage and Selection Operator) is a powerful method that performs two main tasks: regularization and feature selection. During feature selection, the variables that still have a non-zero coefficient after the shrinking process are selected to be part of the model. The goal of this process is to minimize the prediction error.

As the coefficients grow away from 0, the penalty term punishes them, causing the model to shrink their values in order to reduce the loss. In practice the tuning parameter λ, which controls the strength of the penalty, is very important: when λ is sufficiently large, coefficients are forced to be exactly zero, and in this way dimensionality can be reduced. The larger λ is, the more coefficients are shrunk to zero. On the other hand, if λ = 0 we recover an OLS (Ordinary Least Squares) regression.
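Formally, Lasso minimizes the least-squares loss plus an L1 penalty on the coefficients:

```latex
\min_{\beta}\; \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 \;+\; \lambda \sum_{j=1}^{p} |\beta_j|
```

Setting λ = 0 drops the second term and leaves the ordinary least-squares objective.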

→ Step 5: Save the model
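The five steps above can be sketched in Python as follows. Note that the tiny synthetic DataFrame and the model filename here are only stand-ins for illustration; the real code reads the full ipl.csv (76,014 rows, 15 columns) from the link given earlier.

```python
# Minimal sketch of the five training steps, using a tiny synthetic
# stand-in for the real ipl.csv dataset.
import pickle
import pandas as pd
from sklearn.linear_model import Lasso

# Step 1: load and read the dataset
# (the real code would do: df = pd.read_csv("ipl.csv"))
df = pd.DataFrame({
    "bat_team": ["CSK", "MI", "CSK", "RCB", "MI", "RCB"],
    "overs":    [5.0, 5.0, 6.2, 7.1, 8.0, 9.3],
    "runs":     [38, 41, 52, 60, 77, 80],
    "wickets":  [1, 0, 2, 1, 3, 2],
    "total":    [165, 172, 158, 180, 170, 162],  # first-innings final score
})

# Step 2: in the real code, unwanted features (venue, date, ...) are dropped here.

# Step 3: one-hot encode the categorical team feature
X = pd.get_dummies(df.drop(columns="total"), columns=["bat_team"])
y = df["total"]

# Step 4: train a Lasso regression model
model = Lasso(alpha=1.0)
model.fit(X, y)

# Step 5: save the trained model (filename is a hypothetical choice)
with open("first-innings-score-model.pkl", "wb") as f:
    pickle.dump(model, f)

print(round(float(model.predict(X.iloc[[0]])[0])))
```

The saved pickle is what the Flask application later loads to serve predictions.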

After writing the code, the model was trained using Jenkins; we have shown the output below.

Jenkins Job 3 :- This job hosts the web application for our machine learning model.

To design our web application we used Flask, a Python library, to integrate our machine learning model. The basic technologies we used to design the web application are the following.

→ HTML : Hypertext Markup Language is the standard markup language for documents designed to be displayed in a web browser. It can be assisted by technologies such as Cascading Style Sheets and scripting languages such as JavaScript. We used HTML inside our web app to prepare the structure of the web pages.

→ CSS : Cascading Style Sheets is a style sheet language used for describing the presentation of a document written in a markup language such as HTML. CSS is a cornerstone technology of the World Wide Web, alongside HTML and JavaScript. We used CSS inside our web application to make the web UI attractive and cohesive.

→ BOOTSTRAP : Bootstrap is a free and open-source CSS framework directed at responsive, mobile-first front-end web development. It contains CSS- and JavaScript-based design templates for typography, forms, buttons, navigation, and other interface components. We used Bootstrap to make the web app responsive on any size of device.

→ KIT FONT-AWESOME : Kits are the fastest and easiest way to get Font Awesome up and running on your website. A kit is a little bundle of settings and icons: by creating one you save time getting icons onto your site and make them load fast. We used a Font Awesome kit to access the icons over the network.

→ Flask : Flask is a micro web framework written in Python. It is classified as a microframework because it does not require particular tools or libraries. It has no database abstraction layer, form validation, or any other components where pre-existing third-party libraries provide common functions.

→ Flask-Static-Folder : Flask automatically adds a static view that takes a path relative to the application's static directory and serves it. Our index.html template already has a link to the style.css file there.

→ Flask-Templates-Folder : Templates are files that contain static data as well as placeholders for dynamic data. A template is rendered with specific data to produce a final document. Flask uses the Jinja template library to render templates. In our application, we use templates to render the HTML that is displayed in the user's browser.
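As an illustration of how these pieces fit together (the route names, form parameter and placeholder formula below are assumptions, not our exact code; the real app loads the pickled model and uses template files rather than an inline string):

```python
# Hypothetical sketch of a Flask app serving a score prediction.
# Routes, the "runs" parameter and the formula are illustrative only.
from flask import Flask, request, render_template_string

app = Flask(__name__)

PAGE = """
<html><body>
  <h1>IPL First-Innings Score Predictor</h1>
  <p>Predicted score: {{ score }}</p>
</body></html>
"""

@app.route("/")
def home():
    # Landing page with no prediction yet
    return render_template_string(PAGE, score="-")

@app.route("/predict")
def predict():
    # The real app would pickle.load the trained model and feed it the
    # current runs, wickets and overs; here we use a placeholder formula.
    runs = float(request.args.get("runs", 0))
    predicted = int(runs * 3.2)  # placeholder, not the real model
    return render_template_string(PAGE, score=predicted)

# In production: app.run(host="0.0.0.0", port=1234)
```

In the real project the page is served from the templates folder and styled with the CSS and Bootstrap assets in the static folder.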

We run this code with the help of Jenkins; the web application is deployed in a Docker container running on our AWS EC2 instance. The output of this job is given below:

Jenkins Job 4 :- This job checks whether the web application's server is running or not; if not, it sends an email to the developer.

This job notifies the developer via mail if our web application is not in a running state. We used a Python script ( mail.py ) to send the mail to the developer. When we open our web application using the AWS instance's public IP, it shows the page below. If you also want to view our web page, click on the following link: http://50806185:1234
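The article does not show mail.py itself, but a minimal version of such a health-check script might look like the sketch below. Every address, password, SMTP host and URL here is a placeholder, not from the real project.

```python
# Hypothetical sketch of a mail.py-style health check; all addresses,
# credentials and URLs below are placeholders.
import smtplib
import urllib.request
from email.mime.text import MIMEText

APP_URL = "http://127.0.0.1:1234/"   # placeholder app URL

def build_alert(sender, recipient):
    """Build the alert mail sent when the web server is down."""
    msg = MIMEText("The IPL predictor web application is not responding.")
    msg["Subject"] = "ALERT: web application down"
    msg["From"] = sender
    msg["To"] = recipient
    return msg

def check_and_notify():
    """Probe the app URL; on failure, mail the developer."""
    try:
        urllib.request.urlopen(APP_URL, timeout=5)
    except Exception:
        msg = build_alert("jenkins@example.com", "developer@example.com")
        with smtplib.SMTP("smtp.example.com", 587) as server:  # placeholder host
            server.starttls()
            server.login("jenkins@example.com", "app-password")
            server.send_message(msg)

if __name__ == "__main__":
    # Jenkins would schedule check_and_notify(); here we just show the alert.
    print(build_alert("jenkins@example.com", "developer@example.com")["Subject"])
    # → ALERT: web application down
```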

So this is all about our project. If you have any queries while reading this article, please contact us; we will be happy to help and solve your queries.

We completed this project in a team of four members; our team name is TECH CREW. Our team members' names and LinkedIn profile links are mentioned below:

  1. Prakash Singh Rajpurohit ( LinkedIn Profile )
  2. Gagandeep Gupta ( LinkedIn Profile )
  3. Prateek Mishra ( LinkedIn Profile )
  4. Ankit Kumar ( LinkedIn Profile )

Our project has also been uploaded to our team's GitHub account, so you can visit it there to view our code.

Thank you so much to our mentor Mr. Vimal Daga sir 🙏🙏; we learned all these things under his guidance alone.

Finally, we want to express our gratitude to all the readers for reading our article with patience 🙏🙏🙏🙏.


Prateek Mishra

I am a tech enthusiast who thrives on experimenting with cutting-edge technologies