Posts

Showing posts from 2018

Creating an OpenStack development environment with an existing External Network via Packstack

Packstack provides a simple, well-automated process for readying OpenStack development environments. I'd like to document the reproducible steps I've been using to set up these sorts of environments. Running this on a single node is very straightforward: http://haidv204.blogspot.com/2018/06/how-to-install-openstack-using-rdo.html Expanding the setup to multiple nodes is similarly straightforward: replace the flag --allinone with --install-hosts=${controller_node_ip},${compute_node_1_ip},${compute_node_2_ip}... and it fires off a multi-node setup. This, however, does not account for external networks or other parts of your network outside of OpenStack that you'd like these resources to be able to connect to. The servers in use in my environment are: Controller: 16GB RAM / 100GB Disk, 8 vCPUs. Note: This will work with far fewer resources. I deployed this into a public cloud; for this reason, I al...
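
As a rough illustration of the flag swap described above, here is a minimal Python sketch that assembles the Packstack command line from a list of node addresses. The function name `packstack_command` and the 192.0.2.x addresses are hypothetical, chosen only for this example; the `--allinone` and `--install-hosts` flags are the ones named in the post. The sketch only builds the command and does not run it.

```python
import ipaddress
import shlex


def packstack_command(controller_ip, compute_ips=()):
    """Build the packstack argument list for a single- or multi-node install.

    Hypothetical helper for illustration; it only assembles arguments.
    """
    hosts = [controller_ip, *compute_ips]
    # Validate early so a typo fails here rather than midway through a deployment.
    for host in hosts:
        ipaddress.ip_address(host)  # raises ValueError on a malformed address
    if len(hosts) == 1:
        # Single node: --allinone installs everything on the local machine,
        # so the address itself is not passed on the command line.
        return ["packstack", "--allinone"]
    # Multi-node: controller first, then each compute node, comma-separated.
    return ["packstack", "--install-hosts=" + ",".join(hosts)]


if __name__ == "__main__":
    # Placeholder addresses from the 192.0.2.0/24 documentation range.
    cmd = packstack_command("192.0.2.10", ["192.0.2.11", "192.0.2.12"])
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps the result ready to hand to `subprocess.run` without shell quoting concerns.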

How to Install OpenStack Using RDO Packstack

Prerequisites for Packstack: Packstack is based on OpenStack Puppet modules. It's a good option when installing OpenStack for a POC or when all OpenStack controller services are installed on a single node. Packstack defines OpenStack resources declaratively and sets reasonable default values for all settings that are essential to installing OpenStack. The settings can be read or modified in a file, called the answer file in Packstack. Packstack runs on RHEL 7 or later and on the equivalent CentOS versions. The machine where Packstack will run needs at least 4GB of memory, at least one network adapter, and x86 64-bit processors with hardware virtualization extensions. Install RDO Repository: To install OpenStack, first download the RDO repository RPM and install it. On RHEL: $ sudo yum install -y https://rdoproject.org/repos/rdo-release.rpm On CentOS: $ sudo yum install -y centos-release-openstack-mitaka Install OpenStack: Install the Packs...
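
To make the idea of reading and modifying the answer file concrete, here is a small Python sketch. It assumes the answer file is an INI-style file with a `[general]` section of `CONFIG_*` keys; the two keys and their values in the sample below are illustrative stand-ins, not a complete or authoritative file. The helper names `load_answers` and `dump_answers` are hypothetical.

```python
import configparser
import io

# Illustrative sample only; a real answer file holds many more settings.
SAMPLE_ANSWER_FILE = """\
[general]
CONFIG_DEFAULT_PASSWORD=changeme
CONFIG_PROVISION_DEMO=y
"""


def load_answers(text):
    """Parse answer-file text while preserving the upper-case key names."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # the default would lower-case every key
    parser.read_string(text)
    return parser


def dump_answers(parser):
    """Serialize the settings back to KEY=value text (comments are not kept)."""
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    answers = load_answers(SAMPLE_ANSWER_FILE)
    # Change one setting and print the resulting file contents.
    answers["general"]["CONFIG_PROVISION_DEMO"] = "n"
    print(dump_answers(answers))
```

Note that `configparser` drops comments when writing, so for a heavily commented file a targeted text substitution may be the gentler choice.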

Merge AVHDX Hyper-V Checkpoints

When you create a checkpoint (sometimes called a snapshot) of a virtual machine in Microsoft Hyper-V, a new file is created with the .avhdx file extension. The name of the file begins with the name of its parent VHDX file, followed by a GUID that uniquely identifies that checkpoint. You can see an example of this in the Windows Explorer screenshot below. Creating lots of checkpoints results in many .avhdx files, which can quickly become unmanageable. Consequently, you might want to merge these files together. Merging an .avhdx file with its parent .vhdx file is quite easy to accomplish. PowerShell Method: Windows 10 includes support for a Merge-VHD PowerShell command, which is incredibly easy to use. In fact, you don't even need to be running PowerShell "as Administrator" in order to merge VHDX files that you have access to. All you need to do is call Merge-VHD with the...
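
The naming pattern described above (parent name, then a GUID, then `.avhdx`) can be sketched in Python as follows. This is only an illustration: the underscore separator between the parent name and the GUID is an assumption, the function name `split_checkpoint_name` and the sample path are hypothetical, and in a chain of checkpoints the actual parent may itself be another `.avhdx` file rather than the base `.vhdx`.

```python
import re
from pathlib import PureWindowsPath

# Assumed layout: <parent name>_<GUID>.avhdx (the "_" separator is an assumption).
_AVHDX_NAME = re.compile(
    r"^(?P<parent>.+)_"
    r"(?P<guid>[0-9A-F]{8}(?:-[0-9A-F]{4}){3}-[0-9A-F]{12})"
    r"\.avhdx$",
    re.IGNORECASE,
)


def split_checkpoint_name(path):
    """Return (base VHDX file name, GUID) for a checkpoint file, or None."""
    name = PureWindowsPath(path).name  # parses Windows paths on any OS
    match = _AVHDX_NAME.match(name)
    if match is None:
        return None
    return match.group("parent") + ".vhdx", match.group("guid").upper()


if __name__ == "__main__":
    # Hypothetical path, used only to exercise the pattern.
    sample = r"C:\VMs\web01_8E1B5D2A-3C4F-4A6B-9D7E-0123456789AB.avhdx"
    print(split_checkpoint_name(sample))
```

A helper like this is handy for grouping many checkpoint files by their base disk before deciding which ones to merge.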

CI/CD with CircleCI - Heroku deploy

Note: In this post, we'll deploy a Flask app to Heroku. Any commit to GitHub triggers CircleCI, which runs the tests. If the tests pass, CircleCI deploys our app to Heroku. Sign up by authorizing GitHub: First, sign up for CircleCI. We can log in to the CircleCI platform via GitHub by allowing access to the repo. After clicking "Authorize application", we'll see a welcome screen with a list of our GitHub repositories: Installing and running locally: We'll use the following source on GitHub: circleci-heroku. Here are the steps to install and run the tests: Clone the repo and cd circleci-heroku. Set up virtualenv: virtualenv venv and then source venv/bin/activate. Run pip install -r requirements.txt (preferably inside a virtualenv) to install the dependencies. To run the "hello" app locally: (venv) k@laptop:~/TEST/circleci-heroku$ python hello/hello_app.py * Running on http...

Scaling Kubernetes to 2,500 Nodes

We’ve been running  Kubernetes  for deep learning research for over two years. While our largest-scale workloads manage bare cloud VMs directly, Kubernetes provides a fast iteration cycle, reasonable scalability, and a lack of boilerplate which makes it ideal for most of our experiments. We now operate several Kubernetes clusters (some in the cloud and some on physical hardware), the largest of which we’ve pushed to over 2,500 nodes. This cluster runs in Azure on a combination of D15v2 and NC24 VMs. On the path to this scale, many system components caused breakages, including etcd, the Kube masters, Docker image pulls, network, KubeDNS, and even our machines’ ARP caches. We felt it’d be helpful to share the specific issues we ran into, and how we solved them. etcd After passing 500 nodes in our cluster, our researchers started reporting regular timeouts from the  kubectl  command line tool. We tried adding more Kube masters (VMs running  kube-apiserv...