Principal Software Engineer at Lumiata
San Mateo, CA, US
We are seeking a Principal Software Engineer with a focus on big data problems to join our nimble and growing software engineering team. This individual will play a critical role in developing and maintaining Lumiata’s core data science and machine learning infrastructure. They will help develop pipelines for training and deploying machine learning models and build high-performance systems to understand complex patient data at scale.

As a Principal Engineer at Lumiata, you will learn in detail how medical information flows from the patient all the way through to meaningful insights, and the many intricacies involved along the way, including but not limited to:
  • How AI and machine learning can transform healthcare.
  • How to build and deploy novel machine learning products.
  • How medical information is stored and communicated between different actors in the healthcare system.
  • What modern, open standards have been developed to better communicate and represent medical data.
  • Which standards must be respected when handling sensitive healthcare data, including HIPAA, SOC 2 and HITRUST among others, and how to ensure compliance with them.

As a Principal Engineer, you are expected to have a strong leadership presence and not only to learn these aspects, but also to share that knowledge with the more junior members of the team.
Key Responsibilities
  • Research and design algorithms to solve problems using medical and healthcare data.
  • Design and build distributed systems to manage various workflows and the data/model lifecycle.
  • Set a strong example for the more junior members of the team.
  • Maintain a strong outward-facing technical presence (blog posts, open source, conferences, etc.).
  • Lead the overall architecture and design for Lumiata software across the organization.
  • Build and develop pipelines to:
      • Handle large volumes of heterogeneous medical data.
      • Train, update, version and maintain production machine learning systems.
      • Expose multilayer machine learning systems to customers while maintaining first-class privacy and security standards.
      • Extract and expose external public data sources to enhance medical machine learning models.
Basic Qualifications
  • At least 10 years of experience in:
      • Software development and design
      • Writing production code in one or more of the following: C++, Java, Scala, Go and Python
      • Working on large-scale distributed systems
  • At least 5 years of experience in:
      • Big data and machine learning applications, for example Hadoop, Spark, Kubernetes, TensorFlow, XGBoost, etc.
      • Cloud infrastructure: AWS, GCP or Azure
Preferred Qualifications
  • Experience with high-concurrency platforms, graph databases or applied mathematics
  • B.S., M.S. or Ph.D. degree in Computer Science or a related field
  • Substantial contributions to open source projects
  • Ability to inspire and motivate others
  • Leader who sets high standards
  • Leader with a critical eye who knows how to provide feedback in productive ways
  • Entrepreneurial spirit
  • Strategic, high-level thinker who is also willing and able to get very deep in the weeds
  • Healthy sense of urgency and a good instinct for effective rapid prototyping

About Lumiata
Lumiata delivers machine-learning-powered health analytics to make healthcare smarter. At the intersection of clinical knowledge, data science and machine learning, Lumiata provides cost and risk analytics to health plans, care providers and employers.
We process terabytes of patient data per customer, which we use to train models that solve our customers’ prediction and classification problems. We use a variety of ML techniques, ranging from simple linear regression and decision trees to SVMs and deep learning. We’re building a big data and machine learning platform for managing petabytes of data and for giving our data science team the capabilities to iterate very quickly throughout the ML experimentation lifecycle: data cleansing, feature engineering, training, prediction/classification, tuning, and repeat. Our ambition is to reach economies of scale through our platform.