Homebrew is an AI R&D Lab that works on open-source, local AI. We are the creators and maintainers of popular open-source AI tools:
- Jan: Desktop Copilot (1 million downloads)
- Cortex: Local, open-source alternative to OpenAI Platform
- Menlo: GPU cloud for model trainers
We are a fully remote company. In the long term, our objective is to train useful, safe AI that helps improve the lives of our children.
Homebrew is looking for an OpenStack Engineer who can help drive the development of our On-Premise GPU Cloud.
Responsibilities
- Design and architect robust, resilient, and scalable OpenStack cloud infrastructure to meet the organization's computing, storage, and networking requirements
- Lead the deployment, configuration, and integration of core OpenStack services including Nova, Neutron, Cinder, Glance, Keystone, Horizon, and Heat
- Automate the provisioning and management of OpenStack environments using tools like Ansible, Puppet, or Heat
- Ensure high availability and fault tolerance across the OpenStack control plane and compute/storage resources
- Monitor and troubleshoot issues within the OpenStack environment, and implement proactive measures to maintain optimal performance
- Collaborate with the network, storage, and security teams to integrate OpenStack with existing infrastructure
- Develop and document standard operating procedures for deploying, upgrading, and maintaining the OpenStack environment
- Provide technical guidance and support to the cloud operations team
- Stay up-to-date with the latest OpenStack releases and roadmap, and evaluate new features and capabilities for potential adoption
Requirements
- Experience designing, deploying, and managing large-scale OpenStack cloud infrastructure
- Extensive knowledge of OpenStack architecture, components, and deployment best practices
- Proficiency in configuring and integrating core OpenStack services (Nova, Neutron, Cinder, Glance, Keystone, Horizon, Heat)
- Experience with OpenStack high availability and fault tolerance mechanisms
- Strong scripting and automation skills using tools like Ansible, Puppet, or Heat
- Familiarity with cloud networking concepts (VLANs, routing, load balancing, security groups)
- Experience with cloud storage technologies (Ceph, Swift, Cinder, etc.)
- Excellent troubleshooting and problem-solving skills
- Strong communication and collaboration skills to work effectively with cross-functional teams
- Familiarity with container technologies (Docker, Kubernetes) and their integration with OpenStack
- Experience with cloud monitoring and logging tools (Nagios, Prometheus, Grafana)
Benefits
- We pay an "all-in" salary, and you cover your own insurance and medical costs from that amount.
- 14 days of leave (plus unlimited sick days).
- Annual equipment budget (available once the 2-month probation has been completed).