In football, once the transfer window opens, clubs start bidding for, buying, loaning, borrowing, or selling players. These transfers are done to get the ‘best’ players or sell the ‘worst’ players.

Transfers are expensive, so clubs and leagues go for the best players they can get to build their teams.

I will carry out an exploratory data analysis that answers these questions:

  1. Which clubs are the top 10 buying clubs?
  2. Which leagues are the top selling leagues?
  3. Which leagues are the top buying leagues?
  4. Does age determine the value of…


Threat detection is a huge and evolving field. It can take the form of cyber-protection mechanisms or software installed in places perceived as important. But what about personal safety? How does one detect danger to oneself?


Sonix-Locale is a proposed threat detection mechanism for mobile phones that detects danger using energy emissions from weapons.

Data is collected using heat wave detection software, labelled according to weapon type, and fed to the model.

Business Case:

This is important because it not only improves the personal safety of users but also reduces the money the government spends on threat detection.

This is the second phase of this project. In this phase, I will train and evaluate models on Google AutoML Vision to see how different factors in the datasets affect a model's output. The first phase of this project, creating and annotating the dataset with Appen, can be found here.

The dataset is divided into four categories, each in its own zip folder. Each dataset was automatically split into 80% training, 10% validation, and 10% testing.

Splitting dataset:
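To make the 80/10/10 proportions concrete, here is a minimal sketch of a comparable split done by hand in Python. It only illustrates the ratios and is not how AutoML Vision performs its automatic split; the function name `split_dataset`, the seed, and the file names are hypothetical.

```python
import random


def split_dataset(items, train=0.8, val=0.1, seed=0):
    """Shuffle items and split them into train/validation/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a repeatable split
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # whatever remains becomes the test set
    )


# Hypothetical file names standing in for one category of images.
image_names = [f"image_{i:03d}.png" for i in range(100)]
train_set, val_set, test_set = split_dataset(image_names)
print(len(train_set), len(val_set), len(test_set))
```

Shuffling before slicing keeps each subset a random sample of the category rather than a block of consecutive files.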

According to Appen, data annotation is the categorization and labeling of data for AI applications. This categorization and labeling is done to achieve a specific use case in relation to the business problem you are trying to solve.

The people whose job is to annotate this data for machine learning applications are called annotators. Without test validation, their work can introduce bias into the data, since humans are unconsciously biased, which can lead to inaccurate or improper labelling.

I will be demonstrating the use of the annotation software provided by Appen to create annotations for detecting pneumonia in…

Linear regression is an algorithm used to model the relationship between two variables: an independent variable and a dependent variable.

The independent variable is the variable that is not impacted by the other variable. When adjustments are made in this variable, the levels of the dependent variable will fluctuate.

The dependent variable is the variable that is being studied, and is what the regression model attempts to predict.

The relationship between the input variable (X), which is the independent variable, and the target variable (Y), which is the dependent variable…
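As an illustration of how the independent variable X is used to predict the dependent variable Y, here is a minimal sketch of simple linear regression fitted by ordinary least squares in plain Python. The helper name `fit_line` and the sample data are assumptions made for this example.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data generated from y = 2x + 1.
x = [1, 2, 3, 4, 5]   # independent variable
y = [3, 5, 7, 9, 11]  # dependent variable
slope, intercept = fit_line(x, y)
print(slope, intercept)
```

Because the sample points lie exactly on a line, the fit recovers a slope of 2 and an intercept of 1; with noisy data the same function returns the best-fitting line in the least-squares sense.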

A Gaussian function is a function that cuts across mathematics and statistics. It is also used in signal processing for computer vision.

It describes a continuous probability distribution for a real-valued random variable and is characterized by a ‘bell curve’ graph, which signifies the normal distribution.
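In its standard textbook form, the Gaussian probability density is written as:

```latex
f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\!\left( -\frac{(x - \mu)^{2}}{2\sigma^{2}} \right)
```

The fraction in front is the coefficient that normalizes the area under the curve to 1, and the exponential term gives the curve its bell shape.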


μ = mean

σ² = variance, the square of the standard deviation σ.


To represent uncertainty in the estimated location of a self-driving vehicle.


After importing the required library, I defined the Gaussian function from the above equation, taking both the exponential and the coefficient into account. …
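A minimal sketch of such a function, assuming the standard Gaussian density and using Python's built-in `math` module (the library used in the original code is not shown in this excerpt). The function name `gaussian` and the example values are hypothetical.

```python
import math


def gaussian(x, mu, sigma2):
    """Evaluate the Gaussian density with mean mu and variance sigma2 at x."""
    coefficient = 1.0 / math.sqrt(2.0 * math.pi * sigma2)
    exponential = math.exp(-0.5 * (x - mu) ** 2 / sigma2)
    return coefficient * exponential


# The density peaks at the mean and is symmetric around it.
peak = gaussian(10.0, mu=10.0, sigma2=4.0)
left = gaussian(8.0, mu=10.0, sigma2=4.0)
right = gaussian(12.0, mu=10.0, sigma2=4.0)
print(peak, left, right)
```

Keeping the coefficient and the exponential as separate variables mirrors the two parts of the equation and makes each one easy to check on its own.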

Machine learning involves analyzing large sets of data to look for trends or correlations, and using them to help characterize new observations and, in some cases, to perform tasks.

Using machine learning in research is like using other new tools in the practice of science.

In this article, I will outline the machine learning laboratory protocols used in medical imaging diagnosis.


This section is where you define what you are working on, the available statistics, previous or traditional methods used for this kind of project, the kinds of data usually associated with this…

According to the FDA, medical imaging refers to several different technologies that are used to view the human body in order to diagnose, monitor, or treat medical conditions. Imaging comes in different formats, including CT scans, MRI, and X-rays. This technique has been in existence since the 1960s and has evolved over the years.

Artificial intelligence has been introduced into this sector of healthcare, where it has improved accuracy and supported precision medicine. It is also cost-effective, faster, and more efficient, and it reduces burnout in sectors that deal with medical imaging.

Applying AI in 2D…

An Electronic Health Record (EHR), according to Wikipedia, is the systematized collection of patient and population health information stored electronically in a digital format. These records are shared through network-connected information systems or other information networks and exchanges. EHRs may include a range of data, from demographics, medical history, medication and allergies to immunization status, laboratory test results, radiology images, vital signs, personal statistics like age and weight, and billing information.

This project is a hypothetical case of a data scientist working with EHR data for patient selection for diabetes, and it is one of my projects from the AI for Healthcare nanodegree program. This…

Feature scaling in machine learning is the process of putting features on a comparable scale, which matters for algorithms that calculate distances between data points. There are many methods of scaling data, but in this practice I worked with the standard scaler from scikit-learn.

The standard scaler standardizes a feature by subtracting the mean and then scaling to unit variance. This results in a distribution with a standard deviation of 1. The variance is also 1, because variance is the standard deviation squared, and 1 squared is 1. It also makes the mean of the distribution 0. If the feature is roughly normally distributed, about 68% of the values will lie between -1 and 1.
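The arithmetic behind standardization can be sketched in plain Python as below. This is an illustration of the calculation only, not scikit-learn's implementation; it uses the population standard deviation, which is also what scikit-learn's `StandardScaler` divides by. The helper name `standardize` and the sample values are hypothetical.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (divides by n)
    return [(v - mean) / std for v in values]


# Hypothetical feature values, e.g. ages in years.
ages = [20, 30, 40, 50, 60]
scaled = standardize(ages)
print(scaled)
print(fmean(scaled), pstdev(scaled))
```

After scaling, the mean of the values is 0 and their standard deviation is 1, up to floating-point rounding.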


Nwosu Rosemary

Data Scientist || Machine Learning enthusiast and hobbyist
