Dear all,
● Responsible for the design, coding, testing, and deployment phases of data science projects.
● Feature engineering, including numerical imputation, handling outliers with standard deviation, outlier detection with percentiles, binning, log transform, one-hot encoding, grouping operations, feature split, and scaling using normalization and standardization.
● Cleaned data using Spark RDD tools and SQL.
● Worked on NLP, covering natural language understanding and generation, and chatbots.
● Used the deep learning tools Google TensorFlow and Keras with Python.
● Created classification models for customer segmentation using XGBoost, SVM, and random forest.
● Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
● These text files are then loaded into a Hive external table through an Oozie workflow. Hive views are created on top of this table and interfaced with data visualization tools such as BO to generate reports.
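For illustration, several of the feature engineering steps named in the list above (imputation, outlier handling with standard deviation and percentiles, binning, log transform, one-hot encoding, normalization, standardization) can be sketched in plain Python. This is only a minimal sketch using the standard library; the function names, the sample values, and the thresholds (2 standard deviations, 5th/95th percentiles) are assumptions made for the example, not details taken from the project.

```python
import math
import statistics


def impute_median(values):
    """Numerical imputation: replace None with the median of observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def std_outliers(values, k=2.0):
    """Return values lying more than k standard deviations from the mean."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) > k * std]


def percentile_outliers(values, low=5, high=95):
    """Return values outside the [low, high] percentile band."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    lower, upper = cuts[low - 1], cuts[high - 1]
    return [v for v in values if v < lower or v > upper]


def bin_value(value, edges, names):
    """Binning: return the name of the first bin whose upper edge is >= value."""
    for edge, name in zip(edges, names):
        if value <= edge:
            return name
    return names[-1]  # anything above the last edge falls in the final bin


def log_transform(values):
    """Log transform with log(1 + x) so that zero is handled."""
    return [math.log1p(v) for v in values]


def one_hot(labels):
    """One-hot encoding with columns in sorted category order."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


def min_max_normalize(values):
    """Scaling by normalization: map values onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Scaling by standardization: zero mean, unit standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical sample column: one missing value and one extreme value.
ages = [22, 25, None, 31, 28, 24, 27, 23, 30, 26, 29, 120]
clean = impute_median(ages)
print(std_outliers(clean))
print(percentile_outliers(clean))
print(bin_value(27, edges=[25, 40], names=["young", "adult", "senior"]))
print(one_hot(["red", "blue", "red"]))
```

In practice these steps would more likely be done with Spark or a dataframe library; the plain-Python version is used here only to keep the sketch self-contained.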
As part of the project, we aggregated data from JSON, XML, and CSV files, processed it in Hadoop, stored it in Hive, then transformed it and stored it in Vertica, and generated reports using QlikView.
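As a rough illustration of the aggregation step described above, the sketch below reads records from JSON, XML, and CSV text and brings them into one common shape before summarizing them. It uses only the Python standard library as a stand-in for the Hadoop, Hive, and Vertica stages; the field names (`id`, `amount`), the `record` tag, and the sample data are hypothetical.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def from_json(text):
    """Parse a JSON array of objects into a list of dicts."""
    return [dict(row) for row in json.loads(text)]


def from_csv(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def from_xml(text, record_tag="record"):
    """Parse XML where each <record> element holds one field per child tag."""
    root = ET.fromstring(text)
    return [{child.tag: child.text for child in rec} for rec in root.iter(record_tag)]


def normalize(record):
    """Cast the assumed fields to common types, since CSV and XML yield strings."""
    return {"id": int(record["id"]), "amount": float(record["amount"])}


# Hypothetical inputs, one per source format.
json_text = '[{"id": 1, "amount": 10.5}]'
csv_text = "id,amount\n2,4.5\n"
xml_text = "<records><record><id>3</id><amount>5.0</amount></record></records>"

records = [
    normalize(r)
    for r in from_json(json_text) + from_csv(csv_text) + from_xml(xml_text)
]
total = sum(r["amount"] for r in records)
print(records)
print(total)
```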
--
Regards,
V Rama Prasaa Reddy, SAFe 4 Agilist (SA), Certified Scrum Master (CSM), PMP, CMST, ITIL V3 Foundation Certified, ACSE
+91-9620100445