Principal Data Scientist - Emerging ML
Data is at the center of everything we do. As a startup, we disrupted the credit card industry by individually personalizing every credit card offer using statistical modeling and the relational database, cutting-edge technology in 1988! Fast-forward a few years, and this little innovation and our passion for data have skyrocketed us to a Fortune 200 company and a leader in the world of data-driven decision-making.
As a Data Scientist at Capital One, you'll be part of a team that's leading the next wave of disruption at a whole new scale, using the latest in computing and machine learning technologies and operating across billions of customer records to unlock the big opportunities that help everyday people save money, time and agony in their financial lives.
Team Description
Emerging ML is the data science and machine learning team inside Capital One's Applied Research organization. We research and develop new technologies in Artificial Intelligence, with a focus on Embeddings and Foundation Models. We partner closely with our product and engineering teams to connect emerging technologies with business-critical use cases across Capital One's lines of business.
As part of Emerging ML, you will work on things like:
- Conducting research into self-supervised learning, transformer models, and representation learning
- Building customer behavioral models (using transaction, clickstream, and other data) that identify trends, patterns, and relationships related to product usage
- Refining integration patterns for encoder and decoder models so that Applied Research products connect to downstream business use cases
Role Description
This is an individual contributor position. In Emerging ML, you will work across all phases of the data science lifecycle. You will:
- Build machine learning models through all phases of development, from design through training, evaluation and validation, and partner with engineering teams to operationalize them in scalable and resilient production systems that serve 50+ million customers.
- Partner closely with a variety of business and product teams across Capital One to conduct the experiments that guide improvements to customer experiences and business outcomes in domains like marketing, servicing and fraud prevention.
- Write software (e.g., Python, Scala) to collect, explore, visualize and analyze numerical and textual data (billions of customer transactions, clicks, payments, etc.) using tools like Spark and AWS.
The ideal candidate will be:
- Curious and creative. You thrive on bringing definition to big, undefined problems. You love asking questions, and you love pushing hard to find the answers. You're not afraid to share a new idea. You communicate clearly and effectively to share your findings with non-technical audiences.
- Technical. You have hands-on experience developing data science solutions from concept to production using open source tools and modern cloud computing platforms. You are not afraid of petabytes of data.
- Statistically minded. You have built models, validated them and backtested them. You know how to interpret a confusion matrix or a ROC curve. You have experience with clustering, classification, sentiment analysis, time series analysis and deep learning.
- Customer and product oriented. You share our passion for changing banking for good.
Basic Qualifications
- Currently has, or is in the process of obtaining, a Bachelor's Degree plus 5 years of experience in data analytics; or currently has, or is in the process of obtaining, a Master's Degree plus 3 years of experience in data analytics; or currently has, or is in the process of obtaining, a PhD, with the expectation that the required degree will be obtained on or before the scheduled start date
- At least 1 year of experience in open source programming languages for large scale data analysis
- At least 1 year of experience with machine learning
- At least 1 year of experience with relational databases
Preferred Qualifications
- Master's Degree in a "STEM" field (Science, Technology, Engineering, or Mathematics) plus 3 years of experience in data analytics
- Experience building transformer models at scale (>100M parameters)
- Understanding of self-supervised learning methods
- Strong foundation in software engineering
- At least 1 year of experience working with AWS
- At least 2 years of experience in Python, Scala, or R for large scale data analysis
- At least 2 years of experience with machine learning
- At least 2 years of experience with SQL