I'll be using a heart disease dataset from the University of California, Irvine. I'll build progressive logistic models that factor in the cost of each diagnostic test, so that at each stage of modeling we can quantify the model improvement. This will reveal the costs and benefits of these tests and potentially help streamline procedures to save money and time while retaining strong predictive value.
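As a rough illustration of quantifying improvement stage by stage, the sketch below compares the cumulative test cost with the gain in discrimination at each stage. It assumes AUC as the improvement measure, which the abstract does not specify, and the stage names, costs, and scores are made-up placeholders, not results from the heart disease data.

```python
def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stage_report(labels, stages):
    """Summarize cumulative cost and AUC gain for each modeling stage.

    `stages` is a list of (name, added_cost, scores) tuples, in the order
    the diagnostic tests are added to the model (hypothetical structure).
    """
    report = []
    prev_auc, total_cost = 0.5, 0.0  # 0.5 is the AUC of random guessing
    for name, added_cost, scores in stages:
        total_cost += added_cost
        stage_auc = auc(labels, scores)
        report.append({
            "stage": name,
            "total_cost": total_cost,
            "auc": stage_auc,
            "auc_gain": stage_auc - prev_auc,
        })
        prev_auc = stage_auc
    return report


# Illustrative data only: 1 = disease present, 0 = absent.
labels = [1, 1, 0, 0]
stages = [
    ("history only", 0.0, [0.6, 0.4, 0.5, 0.3]),
    ("+ hypothetical lab test", 50.0, [0.8, 0.6, 0.5, 0.2]),
]
report = stage_report(labels, stages)
for row in report:
    print(row)
```

Reading the gain next to the cumulative cost shows whether an added test earns its cost in predictive value.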
In this project we worked on a CLTV solution for the retail industry. This metric represents an estimate of customers' interactions with discounts and other promotional offers. Essential information about the customers, such as their gender, income, and the offers they used, was extracted and analyzed. I also created custom metrics such as view rate and conversion rate for each customer to support business decisions, and finally I applied the K-Means clustering algorithm to cluster the customers to whom the offers were sent.
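The per-customer metrics could be computed along the following lines. This is a minimal sketch under stated assumptions: the abstract does not define the metrics, so view rate is taken here as offers viewed divided by offers received, and conversion rate as offers completed divided by offers viewed. The event labels and customer IDs are hypothetical.

```python
from collections import defaultdict


def customer_offer_metrics(events):
    """Compute per-customer view rate and conversion rate.

    `events` is an iterable of (customer_id, event_type) pairs, where
    event_type is "received", "viewed", or "completed" (assumed labels).
    """
    counts = defaultdict(lambda: {"received": 0, "viewed": 0, "completed": 0})
    for customer_id, event_type in events:
        if event_type in counts[customer_id]:
            counts[customer_id][event_type] += 1

    metrics = {}
    for customer_id, c in counts.items():
        # Guard against division by zero for customers with no activity.
        view_rate = c["viewed"] / c["received"] if c["received"] else 0.0
        conversion_rate = c["completed"] / c["viewed"] if c["viewed"] else 0.0
        metrics[customer_id] = {
            "view_rate": view_rate,
            "conversion_rate": conversion_rate,
        }
    return metrics


# Illustrative events only.
events = [
    ("c1", "received"), ("c1", "viewed"), ("c1", "completed"),
    ("c1", "received"),
    ("c2", "received"), ("c2", "viewed"),
]
metrics = customer_offer_metrics(events)
print(metrics)
```

Metrics of this kind can then serve as input features for a clustering step such as K-Means.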
The credit card industry is a lucrative business. Most banks that issue credit cards run acquisition campaigns to acquire customers. Direct mail is one of the acquisition strategies employed by Fifth Third Bank. It involves sending a physical mail document to a prospective customer with an offer such as a balance transfer offer, a spend-to-get-cashback offer, or a zero percent APR offer for a certain period.
The project seeks to improve the population selection strategy of the direct mail campaigns for Fifth Third Bank. The population for a direct mail campaign is selected by considering a variety of factors, including marketing costs, mail offer, response score deciles, present value of the prospective customer, FICO score, and response rate of the customer. Based on these factors, a metric, Return on Marketing Investment (ROMI), is calculated at the FICO group level and the response score decile level. Only the population that meets the ROMI cut-off is selected for the direct mail campaigns.
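The selection rule can be sketched as follows. The abstract does not give the bank's ROMI formula, so this assumes a common form: expected value per mail piece (response rate times present value) minus cost, divided by cost. The segment names, rates, values, and cut-off are made-up illustrations.

```python
def romi(response_rate, present_value, cost_per_mail):
    """Expected net return per dollar of marketing spend (assumed formula)."""
    expected_value = response_rate * present_value
    return (expected_value - cost_per_mail) / cost_per_mail


def select_segments(segments, cutoff):
    """Return the names of segments whose ROMI meets the cut-off."""
    return [
        s["name"]
        for s in segments
        if romi(s["response_rate"], s["present_value"], s["cost_per_mail"]) >= cutoff
    ]


# Illustrative segments only: one per FICO group and response score decile.
segments = [
    {"name": "FICO 720+, decile 1", "response_rate": 0.02,
     "present_value": 100.0, "cost_per_mail": 0.5},
    {"name": "FICO 660-719, decile 5", "response_rate": 0.004,
     "present_value": 100.0, "cost_per_mail": 0.5},
]
selected = select_segments(segments, cutoff=1.0)
print(selected)
```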
Elective healthcare is one of the categories where credit has not been accessible historically. Lending in such categories comes with unique challenges and opportunities. The size of such loans falls between that of a credit card loan and a mortgage loan. GreenSky Patient Solutions offers credit in this space. As a part of the credit strategy team, the objective here is to choose the most profitable customers and serve them. Using past loan data, we picked the variables that are responsible for defaults in these loans and incorporated them into a model. These factors are usually related to the creditworthiness of an individual. The model does a good job of predicting default after a specific period of time, as it performs well on out-of-sample data. We then select the right cutoff point for the model response in order to maximize the future value of the portfolio. This way, we have the optimal model for current economic conditions. The model can be used until there is a significant change in macro factors, as these would affect key factors such as default rates.
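The cutoff selection step might look like the sketch below. It assumes each historical loan carries a predicted default probability, an observed outcome, a profit if repaid, and a loss if defaulted, and it uses realized value on past loans as a stand-in for future portfolio value. All figures are made up, and the actual valuation used by the team is not described in the abstract.

```python
def best_cutoff(loans, candidate_cutoffs):
    """Return the (cutoff, value) pair with the highest total value.

    A loan is approved when its predicted default probability is at or
    below the cutoff. Each loan is a tuple of
    (p_default, defaulted, profit_if_repaid, loss_if_default).
    """
    best = None
    for cutoff in candidate_cutoffs:
        value = 0.0
        for p_default, defaulted, profit, loss in loans:
            if p_default <= cutoff:
                value += -loss if defaulted else profit
        # Keep the first cutoff that reaches the highest value.
        if best is None or value > best[1]:
            best = (cutoff, value)
    return best


# Illustrative loans only.
loans = [
    (0.05, False, 300.0, 2000.0),
    (0.10, False, 300.0, 2000.0),
    (0.30, False, 300.0, 2000.0),
    (0.40, True, 300.0, 2000.0),
    (0.60, True, 300.0, 2000.0),
]
cutoff, value = best_cutoff(loans, [0.1, 0.2, 0.3, 0.4, 0.5])
print(cutoff, value)
```

Because a single default can wipe out the profit from several good loans, the best cutoff tends to sit well below the point where defaults start to appear.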
Direct Auto Loan is a popular lending product. An auto loan is consumer credit financing for the purchase or refinance of an automobile, with loan amounts ranging from $2,000 to $80,000. The objective of the project is to create a forecasting model for the direct auto loan product that the regional bank offers, based on historical data starting in January 2019. The analysis will use data from 2019 to date, which will then be used to forecast Direct Auto trends through December 2022.
The capstone project analyzes Home Equity Line of Credit (HELOC) balances, payments, and draws for Fifth Third Bank from 2019 to 2021. We observe that balances and balance/balance active have consistently declined since COVID hit in March 2020. Two products, HELOC with a fixed-term balloon and the term product (different time periods for draw and amortization), had similar balances as of October 2021. Balances by customer risk level and combined loan-to-value (CLTV) have also been analyzed, and risk levels 2-3 constitute the majority of the balances. The number of customers reaching time to maturity has doubled over the period 2019-2021. For the term product, the draws and draw per draw active have seen a spike due to the re-introduction of a promo offer.
One of the features currently in development is Trailer Unload Prioritization, which aims to identify the trailers with high-value items for restocking in the clubs and fulfillment centers. The prioritization looks at various parameters, such as the value of items, in-stock status, and special offers, to determine which items need to be unloaded first to minimize lost sales and improve member experience.
The purpose of this study is to analyze which factors play a significant role in increasing customer engagement at grocery stores. Strong customer engagement is important in the retail industry, as it fosters customer loyalty and growth in sales. First, a demographic analysis was done to examine whether there is any association between demographic factors and customer spending. Next, Market Basket Analysis was performed and association rules were created using the Apriori algorithm to uncover relationships between products that are often purchased together. The demographic analysis revealed a positive association between customer income and spending, as well as between household size and spending, and a negative association between customer age and spending. The Market Basket Analysis yielded helpful recommendations regarding product cross-selling, promotional offers, and store layout. Understanding how demographic factors affect customer spending and implementing recommendations based on association rules will lead to an increase in customer engagement at grocery stores.
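To make the association rule metrics concrete, the sketch below computes support, confidence, and lift for item pairs. It is a simplified pair-level illustration, not the full Apriori candidate generation used in the study, and the baskets are made up.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.3):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate items within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # prune infrequent pairs
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            lift = confidence / (item_counts[consequent] / n)
            rules.append((antecedent, consequent, support, confidence, lift))
    return rules


# Illustrative baskets only.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["milk"],
]
rules = {
    (a, c): (s, conf, lift)
    for a, c, s, conf, lift in pair_rules(baskets, min_support=0.5)
}
for rule, stats in rules.items():
    print(rule, stats)
```

A lift above 1 suggests the two products are bought together more often than chance would predict, which is the kind of signal that informs cross-selling and layout recommendations.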
Throughout the Covid-19 global pandemic, many businesses have suffered financially or closed due to reduced business and lost revenue. In response to mandatory quarantine enforcement, some businesses have been able to adapt through increased investments in e-commerce offerings and the online user experience. These investments are often expensive, and it can be difficult to pinpoint which changes would lead to increased revenue. By performing logistic regression on e-commerce clickstream data with thorough variable selection, key metrics can be identified that significantly impact the potential revenue generation of each online shopping session. The analysis identifies six clickstream metrics that have a statistically significant impact on whether revenue is generated from an online shopping session. The resulting analysis provides a good basis for further exploration into key performance indicator tracking for businesses hoping to become more competitive with their e-commerce offerings.
Ridesharing services are companies that match drivers of private vehicles to those seeking local taxicab-like transportation. Ridesharing services are available mostly in large cities in many countries. Some of the biggest names in the industry are Uber, which exists in 58 countries and whose name is almost synonymous with ridesharing services, and Lyft, which covers many American cities. Uber and Lyft are both American multinational ride-hailing companies offering services that include peer-to-peer ridesharing, ride service hailing, and a micro-mobility system with electric bikes and scooters. Their platforms can be accessed via websites and mobile apps. In California, Uber is so dominant that it is a public utility, and operates under the jurisdiction of the California Public Utilities Commission. Ridesharing systems generate a lot of data that can be used for studying mobility in a city. The ability to predict the peak time of the day and the day of the week can allow the businesses to manage those peaks in a more efficient and cost-effective manner. Our goal is to use and optimize machine learning models that effectively predict the price using the available information about the time and day and the weather conditions.
VNDLY is a leading provider of cloud-based contingent workforce-management systems. Launched in 2017, it has grown quickly with multiple Fortune 500 clients and is backed by investments of over $57 million. In its efforts to disrupt the vendor-management space, VNDLY plans to provide dashboards as a product offering to its clients to aid their decision making and help them better manage their non-employee workforce needs. Also, being a product-oriented organization, VNDLY strives to leverage analytical dashboards to make key product-related decisions. This capstone involves dashboard development to refine requirements, build prototypes using Tableau, and ensure data visualization best practices are followed in the product offering for our clients. Analytical dashboards are created using Google Data Studio by sourcing product usage data from Google Analytics. These dashboards follow design principles for effective data visualization and hence aid decision making around product priorities for internal stakeholders at VNDLY.
With the rapid development of the telecommunication industry, service providers are increasingly inclined towards expanding their subscriber base. To survive in this competitive environment, the retention of existing customers has become a huge challenge. Firms are directing more effort into retaining existing customers than into attracting new ones. To achieve this, customers likely to defect need to be identified so that they can be approached with tailored incentives or other bespoke retention offers. Such strategies call for predictive models capable of identifying customers with higher probabilities of defecting in the relatively near future.
Anupreet Gupta, Strategy and Portfolio Analytics, August 2019, (Mike Fry, Siddharth Krishnamurthi) Credit cards have become an important source of revenue for the bank, as it charges a higher Annual Percentage Rate (APR) in comparison to any other consumer lending