Using Data Science in Business Analysis


Course Name

Using Data Science in Business Analysis

Course Code

PD – IT – E2

Number of Contact Hours

45 hours

Credit Hours

3 Credit Hours

Duration and Frequency

  • 15 sessions 
  • Each session = 3 hours
  • Frequency: daily, Monday to Friday
  • Duration: 3 weeks

Mode of Delivery

  • Online / On Campus / Hybrid


Professional Development – 

E – Information Technology in Business


In this course you will learn how to use data science tools to support business decisions. Working with open source tools, you will cover the concepts needed to move through the entire data science pipeline, so that you can analyse your business and make informed decisions.



On completion of this course, participants are expected to be able to:

  • Translate business questions into Machine Learning problems to understand what your data is telling you
  • Explore and analyze data from the Web, Word Documents, Email, Twitter feeds, NoSQL stores, Relational Databases and more, for patterns and trends relevant to your business
  • Build Decision Tree, Logistic Regression and Naïve Bayes classifiers to make predictions about your customers’ future behaviors as well as other business-critical events
  • Use K-Means and Hierarchical Clustering algorithms to more effectively segment your customer market or to discover outliers in your data
  • Discover hidden customer behaviors from Association Rules and build Recommendation Engines based on behavioral patterns
  • Use biologically-inspired Neural Networks to learn from observational data as humans do
  • Investigate relationships and flows between people, computers and other connected entities using Social Network Analysis


Course Outline:


Introduction to R


Exploratory Data Analysis with R

  • Loading, querying and manipulating data in R
  • Cleaning raw data for modeling
  • Reducing dimensions with Principal Component Analysis
  • Extending R with user–defined packages


Facilitating good analytical thinking with data visualization

  • Investigating characteristics of a data set through visualization
  • Charting data distributions with boxplots, histograms and density plots
  • Identifying outliers in data


Working with Unstructured Data


Mining unstructured data for business applications

  • Preprocessing unstructured data in preparation for deeper analysis
  • Describing a corpus of documents with a term–document matrix
  • Making predictions from textual data


Predicting Outcomes with Regression Techniques


Estimating future values with linear regression

  • Modeling the numeric relationship between an output variable and several input variables
  • Correctly interpreting coefficients of continuous data
  • Assessing regression models for ‘goodness of fit’


Categorizing Data with Classification Techniques


Automating the labeling of new data items

  • Predicting target values using Decision Trees
  • Constructing training and test data sets for predictive model building
  • Dealing with issues of overfitting


Assessing model performance

  • Evaluating classifiers with confusion matrices
  • Calculating a model’s error rate
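A minimal sketch of the two evaluation ideas above: a confusion matrix as a tally of (actual, predicted) label pairs, and the error rate as the share of misclassified items. The course itself works in R; Python's standard library is used here purely for illustration, and the labels, data and function names are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Tally how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


def error_rate(actual, predicted):
    """Fraction of items whose predicted label differs from the actual label."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


# Hypothetical test-set labels for a customer churn classifier (made up for illustration).
actual = ["churn", "stay", "stay", "churn", "stay", "stay"]
predicted = ["churn", "stay", "churn", "stay", "stay", "stay"]

matrix = confusion_matrix(actual, predicted)
rate = error_rate(actual, predicted)

for (a, p), count in sorted(matrix.items()):
    print(f"actual={a:5s} predicted={p:5s} count={count}")
print(f"error rate: {rate:.3f}")
```

The off-diagonal cells (actual and predicted labels differ) are the misclassifications; the error rate is their total divided by the number of test items.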


Detecting Patterns in Complex Data with Clustering and Social Network Analysis


Identifying previously unknown groupings within a data set

  • Segmenting the customer market with the K–Means algorithm
  • Defining similarity with appropriate distance measures
  • Constructing tree–like clusters with hierarchical clustering
  • Clustering text documents and tweets to aid understanding


Discovering connections with Link Analysis

  • Capturing important connections with Social Network Analysis
  • Exploring how social network analysis results are used in marketing


Leveraging Transaction Data to Yield Recommendations and Association Rules


Building and evaluating association rules

  • Capturing true customer preferences in transaction data to enhance customer experience
  • Calculating support, confidence and lift to distinguish “good” rules from “bad” rules
  • Differentiating actionable, trivial and inexplicable rules
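The three rule measures named above can be sketched as follows, using their standard definitions: support is the share of transactions containing an item set, confidence is support(LHS and RHS) divided by support(LHS), and lift is confidence divided by support(RHS). The course itself works in R; this Python sketch is for illustration only, and the baskets and function names are hypothetical.

```python
def support(transactions, items):
    """Share of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def confidence(transactions, lhs, rhs):
    """Of the transactions containing `lhs`, the share that also contain `rhs`."""
    return support(transactions, set(lhs) | set(rhs)) / support(transactions, lhs)


def lift(transactions, lhs, rhs):
    """Confidence relative to the baseline frequency of `rhs`.

    A value above 1 suggests `lhs` and `rhs` occur together more often than
    they would if they were independent.
    """
    return confidence(transactions, lhs, rhs) / support(transactions, rhs)


# Hypothetical shopping baskets (made up for illustration).
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "butter"})
rule_confidence = confidence(baskets, {"bread"}, {"butter"})
rule_lift = lift(baskets, {"bread"}, {"butter"})

print(f"bread -> butter: support={rule_support:.2f}, "
      f"confidence={rule_confidence:.2f}, lift={rule_lift:.2f}")
```

A high-confidence rule can still be uninformative if its right-hand side is common in every basket, which is why lift is usually read alongside confidence.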


Constructing recommendation engines

  • Cross–selling, up–selling and substitution as motivations
  • Leveraging recommendations based on collaborative filtering


Learning from Data Examples with Neural Networks


Machine learning with neural networks

  • Learning the weight of a neuron
  • Learning how neural networks are applied to object recognition, image segmentation, human motion and language modeling
  • Analyzing labeled data examples to find patterns that consistently correlate with particular labels for object recognition


Implementing Analytics within Your Organization


Expanding analytic capabilities

  • Breaking down Data Analytics into manageable steps
  • Integrating analytics into current business processes
  • Reviewing Hadoop, Spark, and Azure services for machine learning


Dissemination and Data Science policies

  • Examining ethical questions of privacy in Data Science
  • Disseminating results to different types of stakeholders
  • Visualizing data to tell a story


Course Textbook


Statistics, Data Analysis, and Decision Modeling, 5th Edition

James R. Evans, University of Cincinnati



Feedback Given to Participants in Response to Assessed Work 

  • Individual written feedback on coursework
  • Feedback discussed as part of a tutorial
  • Individual feedback on request
  • Model answers 


Developmental Feedback Generated Through Teaching Activities

  • Feedback is given at presentations and during tutorial sessions
  • Dialogue between participants and staff in tutorials and lectures


The course grade will be based on a final project presented by the participant and graded by the instructor. Participants must achieve a passing grade of 70% or more to be awarded a certificate of completion of the course.