Profile
Experienced data professional with a proven record of working in the higher education and social science research sectors.
Specialized in building hierarchical datasets for rigorous academic research; skilled in data analytics, Python, R, Stata, Hadoop, Spark, Tableau,
PostgreSQL, Linux and PowerShell.
Economics graduate with a strong background in econometrics, data science and machine learning.
I am seeking a new role as a Cloud Data Engineer, Database Developer or Analytics Engineer.
Education
Skills (Technical Proficiency = 50%)
Data pipeline and scripting languages
AWS Cloud | Big Data | HDFS
Data Science and Machine Learning Projects
Machine Learning: Using diagnostic measurements to predict diabetes
Missing data analysis, data augmentation, modeling with scikit-learn algorithms, parameter tuning for XGBoost, Random Forest and ExtraTrees, and cross-validation with GridSearchCV. @samuel-ntsua
Data Science: Used tools such as Hadoop, Hive, Sqoop and Oozie to create a data pipeline that pools stock exchange data for analysis.
This project involved writing complex HiveQL queries to compute the top 5 returns on investment (ROI) as well as the best year for investment for a stock index. @samuel-ntsua
Machine Learning: Reduce the time a Mercedes-Benz car spends on the test bench by selecting the best combination of features that yields the optimum vehicle security outcome.
I performed dimensionality reduction (using PCA and an XGBoost regressor) after training, validating and testing a prediction model that seeks to select fewer features to shorten the vehicle testing period, yet predict a higher safety standard for the car. @samuel-ntsua
Data Science with R Programming: Used sample hospitalization data from a city to predict generalized, statewide hospitalization expenditure.
Built a linear regression model to predict the cost of hospitalization. @samuel-ntsua
Work Experience
• Leverage expertise in the fields of statistics, education, economics, and data science to concurrently manage data for
over 10 research and evaluation projects.
• Build a range of indicator statistics compiled into analytical datasets for use by university professors conducting
academic research and state program evaluations.
• Build and manage a 2.5-terabyte data warehouse to store curated historical datasets while allowing fast,
complex retrieval within a logically organized system.
• Write, test and debug data pipeline scripts as Apache Airflow DAGs (Linux/SSH, Python/Pandas, PostgreSQL) and in Stata
to build OLAP datasets used in research studies and evaluation work,
increasing productivity and reducing processing time by over 60%.
• Write, review and update procedural guides; troubleshoot end users’ remote access to a multi-user Linux cluster; assist
research analysts and external collaborators in scheduling and running SLURM jobs, resulting in increased productivity and
reduced supervision burden.
• Directed the development of a Mobile Computer-Assisted Personal Interviewing (M-CAPI) survey to capture
schoolchildren’s learning levels across 500 elementary schools, in a multi-round national survey.
• Implemented ODK and maintained data pipelines to stream live survey data into a backend data warehouse.
• Coordinated data collection staff, equipment and financial resources (recruitment of 300+ enumerators for CAPI deployments).
• Orchestrated the instruction and development of 280 enumerators over eighteen months, enabling them to
successfully survey more than 80,000 pupils in four weeks and shortening an 18-month process to 3 months (83% gain), thus allowing corrective policy to be implemented in the same
school year.
• Identified students’ areas of difficulty in economic theory.
• Helped students understand basic economic concepts to raise their grades in Microeconomics and Econometrics.
• Served as the Department of Economics’ resource person for statistical applications such as Stata, SPSS and SAS.
• Acquired and reviewed borrowers’ financial information, including book-value determination, asset valuation reports,
business transaction history, collateral reports, market outlook and profit potentials, and credit ratings in order to
prepare clients’ credit risk reports.
• Assessed clients’ requests for loan modification, progressive collateral release, and loan term extension for senior
management approval.
• Coordinated consistent and effective communications and operations among loan officers, clients, and disbursement
departments.
• Monitored and facilitated personal and commercial loan origination and mortgage closing documentation.
• Bridged intercultural and linguistic gaps by educating customers, in both French and English, on financial instruments and banking products to improve their financial and credit status.
• Helped mobilize financial resources for operations by:
- originating and maintaining resource allocation charts
- gathering necessary data to compute budget estimates
- executing funding allotments and payment obligations for Development Assistance Programs (DAP), International
Cooperative Administrative Support Service (ICASS) as well as for the Ambassador’s Special Self Help (SSH) fund.
- verifying and tracking funds to ensure appropriations laws are not violated
- allowing for timely annual review of program activities as well as improved availability of information for
decision-making.
• Analyzed existing program directives and federal legislation to inform and draft annual budget formulations.
• Investigated cost centers to determine budget burden, enabling a well-balanced, prorated cost sharing, and equitable
funds replenishment.
• Gathered transaction data to simulate and justify quarterly budget and financial plans.
• Successfully managed annual portfolio of funds of $50+ million for three consecutive years.
• Processed accounts payable and decreased processing time by a week.
• Verified billing computations for accuracy; created and managed dynamic spreadsheets to track lease
contract renewals and due dates.
• Processed time and attendance for 250+ staff; answered staff questions on sick and annual leave and on expatriates’ rest and recuperation travel
reimbursement; reconciled monthly payroll expenditure with bank statements of account.
Certificates
Cloud Practitioner
Solution Architect - Associate: In progress
Data Analytics - Specialist: In progress
Apache Kafka, Astro (nee Airflow): In progress
Big Data Hadoop and Spark RDD - (Simplilearn)
Data Science with R - EMCDSA (DELL-EMC2)
Linux Essentials (LPI)
Certificate in Survey Research Methodology (UNC-CH)
Data Science using Python and R - (Simplilearn/DataCamp)
Languages
French (My first university degree was entirely taught in French): 95%
Ɛ̀ʋɛ̀ Gbè (Mother tongue; spoken in Ghana, Togo, Benin and Nigeria): 100%
Gã (Mother tongue; spoken in Ghana): 100%
Akan/Twi (Spoken in Ghana and Côte d'Ivoire): 65%