CrystalNet: Faithfully Emulating Large Production Networks

Hongqiang Harry Liu, Yibo Zhu, Jitu Padhye, Jiaxin Cao, Sri Tallapragada, Nuno P. Lopes, Andrey Rybalchenko, Guohan Lu, Lihua Yuan

 

Abstract:

Network reliability is critical for large clouds and online service providers like Microsoft. Our network is large, heterogeneous, complex and undergoes constant churn. In such an environment, even small issues triggered by device failures, buggy device software, configuration errors, unproven management tools and unavoidable human errors can quickly cause large outages. A promising way to minimize such network outages is to proactively validate all network operations in a high-fidelity network emulator before they are carried out in production. To this end, we present CrystalNet, a cloud-scale, high-fidelity network emulator. It runs real network device firmware in a network of containers and virtual machines, loaded with production configurations. Network engineers can use the same management tools and methods to interact with the emulated network as they do with a production network. CrystalNet can handle heterogeneous device firmware and can scale to emulate thousands of network devices in a matter of minutes. To reduce resource consumption, it carefully selects the boundary of the emulation while ensuring that network changes propagate correctly. Microsoft's network engineers use CrystalNet on a daily basis to test planned network operations. Our experience shows that CrystalNet enables operators to detect many issues that could trigger significant outages.

 

Published:

H. H. Liu, Y. Zhu, J. Padhye, J. Cao, S. Tallapragada, N. P. Lopes, A. Rybalchenko, G. Lu, L. Yuan. CrystalNet: Faithfully Emulating Large Production Networks. In Proc. of the 26th Symposium on Operating Systems Principles (SOSP), Oct. 2017.

 

Bibtex:

@inproceedings{crystalnet-sosp17,
  title =	{{CrystalNet}: Faithfully Emulating Large Production Networks},
  author =	{Hongqiang Harry Liu and Yibo Zhu and Jitu Padhye and Jiaxin Cao and Sri Tallapragada and Nuno P. Lopes and Andrey Rybalchenko and Guohan Lu and Lihua Yuan},
  booktitle =	{Proc. of the 26th Symposium on Operating Systems Principles (SOSP)},
  doi =		{10.1145/3132747.3132759},
  month =	oct,
  year =	2017
}

 

Copyright notice:

© ACM, 2017. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution.

 
