path: root/resources/tools/testbed-setup
Age | Commit message | Author | Files | Lines
2021-04-08 | Ansible git move | pmikus | 207 | -9188/+0
+ Better accessibility
+ Compliant with fdio.infra._function_
  - function [pxe|terraform|ansible|vagrant]
+ dill==0.3.3 also applied on TBs
- ci-man to follow today
- Docs to be updated in separate patch
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Iff9eaa29d63044188cc8160db2d9b44b7635782a
2021-04-08 | Infra: AWS Update to Ubuntu 20.04 | Tomas Alexy | 8 | -22/+37
- Change AMI image to Ubuntu 20.04
- Add calibration role
- Add AWS vfio-patch for kernel 5.8+
- Prepare root module's main.tf to be used with environment variables
Signed-off-by: Tomas Alexy <tomas.alexy@pantheon.tech> Change-Id: I7db3f28ba573a5a8a1dc07179ef78ef34ce9ebf3
2021-03-26 | ansible: update TX2 VPP Device hugepages and ifs | Juraj Linkeš | 3 | -11/+5
These interfaces are not used, so remove them until we actually use them. Use fewer hugepages, since we don't need as many and the rest of the memory is more useful for build/host processes. Change-Id: I52b2d6e2812e5cadeab9e51a1bae3688794f414a Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
2021-03-18 | Infra: Shared TG Ansible rules | pmikus | 2 | -4/+6
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I7d505d99003c4ab48b191c1d534513768d03bc83
2021-03-17 | ansible: arm perf TG update | Juraj Linkeš | 1 | -2/+3
Update TG to shared (docker) TG and add hugepages accordingly. Change-Id: I45ece9d1c8d6dbc3174661447ae46a7e28613313 Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
2021-03-17 | Infra: csit-sut temporary build | pmikus | 27 | -286/+184
- Until the issue with fdiotools is solved Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I3c2b06f37014a0571487d527447d90ceafdf52a9
2021-03-15 | Infra: Ansible Ubuntu 20.04 follow-ups | pmikus | 6 | -12/+9
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I36a8b4a6cab976f51685df56c0dc2d95c00e248f
2021-03-11 | Infra: Minor ansible tweaks | pmikus | 2 | -18/+16
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I47de0b78ed64b2426d39c5edf22ec36866609e8e
2021-03-08 | Infra: AWS/Azure fix OOM issues on smaller instances | Tomas Alexy | 2 | -8/+8
Signed-off-by: Tomas Alexy <tomas.alexy@pantheon.tech> Change-Id: Ic799f5eeaf03f34386603421c1d9282167c25aa5
2021-03-07 | Infra: Docker DNS on Nomad hosts | pmikus | 7 | -5/+58
+ Make the host default resolver Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Ifadb8901c66b65b2213878180c87370262ab22f0
2021-03-04 | FIX: Ansible storage bugs | pmikus | 4 | -4/+12
- daily digest Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Ifa2dc10831fde0a101c916f2b8abd26abb93cb7f
2021-02-24 | Infra: Azure - file cleanup | Tomas Alexy | 6 | -72/+78
Signed-off-by: Tomas Alexy <tomas.alexy@pantheon.tech> Change-Id: I8b97123711a76bf8851f6c4997e819d79364b83b
2021-02-19 | Infra: Ansible Ubuntu Focal | pmikus | 36 | -257/+388
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I6558938fe4bbdfb5add7a361adb4a12da6b0a6dc
2021-02-18 | Infra: Fix AWS deployment | Tomas Alexy | 10 | -101/+117
Signed-off-by: Tomas Alexy <tomas.alexy@pantheon.tech> Change-Id: Ie24f5fac5827e28b1ac7c22192a94994700b2910
2021-02-18 | Framework: SciPy upgrade | pmikus | 1 | -1/+1
+ 1.5.4 to solve python3.8 dependency. Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I682bedc18b56d1fed3974f792a4d79656cbe97cb
2021-02-17 | Infra: Ansible docker images cleanup | pmikus | 10 | -74/+188
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I8d67b8ad5db5c0a7c9b3fa892e1e66dab2f666d0
2021-02-16 | Infra: Ansible 2.10 | Peter Mikus | 9 | -1/+49
Signed-off-by: Peter Mikus <pmikus@cisco.com> Change-Id: I6b058ff30628c7e066372fec2141a8bcc18c3997
2021-02-10 | Infra: JenkinsJobHealthExporter | pmikus | 7 | -0/+115
- Integration of Jenkins Job checker Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I822039cb64a3a352b49314ddab7c6099af3fe644
2021-02-04 | Infra: Move probes under ansible instead of terraform | pmikus | 21 | -69/+271
+ More stable probe handling.
+ Naming cleanup due to errors.
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I3bb1237af20636919f869f2eee53597202d00792
2021-02-03 | Ansible: Fix package cache updates | Tomas Alexy | 14 | -15/+86
Signed-off-by: Tomas Alexy <tomas.alexy@pantheon.tech> Change-Id: I5c7b2636bde999fef3a60e6cbf2b36db9978a74a
2021-01-29 | Infra: Monitoring capability | pmikus | 2 | -2/+15
+ Monitoring SOA
+ Nomad alertmanager job
+ Nomad prometheus job
+ Nomad grafana job
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I0b32e9c87276ba1a2d4a5322816f3473c737eae2
2021-01-28 | Infra: Remove Consul TLS on clients (Nomad conflict) | pmikus | 1 | -3/+3
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I7c825150a19dd783a255fcc5cbd31b91c6b0b2cf
2021-01-28 | Infra: Adjust vpp_device x86 memory layout | pmikus | 2 | -2/+2
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I43b3b891b270903419b7fcf132813563306b6e10
2021-01-28 | Infra: Align Nomad settings across cluster | pmikus | 13 | -5/+18
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Id362e47ecee9fd4eac8332978d81f33656880d66
2021-01-07 | ansible: remove unused old ARM nomad hosts | Dave Wallace | 3 | -64/+1
Signed-off-by: Dave Wallace <dwallacelf@gmail.com> Change-Id: Ie2a653fa46119b5971c58478e306920376a8f874
2020-12-17 | Ansible: Fix cleanup procedures | pmikus | 3 | -12/+34
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Ib0c3a508b32a4d5929cfc20a7a7813752350b7d9
2020-12-12 | Refactor storage solution | pmikus | 5 | -0/+63
+ Minio terraform module
+ XL mode enabled with erasure code
+ Upload script as a sample
+ Nginx terraform module
+ Updating ansible to reflect changes
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Ia8c439b749aa0de82bd6f1d0cfbecce4d7000a8f
2020-12-07 | Ansible: Enable consul TLS | pmikus | 24 | -652/+786
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Ia53acc4441087e93a51d87097adea0b220d10144
2020-12-04 | Terraform: csit-shim refactor | pmikus | 6 | -0/+212
- remove snergster image dependency Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I76fef60371e35dddc6da56db5f9207e003d1c792
2020-12-02 | Terraform: Nomad resource definitions | pmikus | 18 | -72/+21
+ storage - final until more SSDs arrive.
+ nginx - final
+ vpp_device - untested yet (restored from EdK setups) - to be rewritten
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Ib9499fc8cfb0d9f5c5d5bbd1ccd856ecc951ec2a
2020-11-30 | ansible: remove yul2 hosts from nomad server pool | Dave Wallace | 3 | -3/+3
Signed-off-by: Dave Wallace <dwallacelf@gmail.com> Change-Id: Ibcbd95408fb4859a13c7f2659a9e15c5498b788b
2020-11-30 | Ansible: Final consul.d fixes | pmikus | 18 | -69/+47
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I2b5f2d090ac752c85508030c4dfe206023f1184f
2020-11-26 | Ansible: Hashicorp Consul | pmikus | 35 | -3/+887
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I56987d744d9143a95954d85f2557cda07220c681
2020-11-24 | ansible: update 3n-tsh hugepages | Juraj Linkeš | 2 | -4/+6
Fix "Not enough availablehuge pages: 1483!". Also update 3n-tsh docs. Change-Id: I1d37a66af1e2363f77fdbd87d238e8ff5535b011 Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
2020-11-24 | lab: ThunderX2 updates | Juraj Linkeš | 2 | -6/+6
Update after switching one 1n-tx2 and the idle ThunderX2 servers between racks. Update the idle ThunderX2 specs to a new perf testbed, 2n-tx2. Add Server-Type-B12 which is a modified Server-Type-B2 with one extra NIC (needed for 2n-tx2). Change-Id: I51af358f1feb476652eddfe82b5af1d0d70ac259 Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
2020-11-18 | Ansible: Docker update | pmikus | 4 | -10/+17
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Ie48a96d83d37d7292d261875371e09d4b9152c7b
2020-11-18 | T-Rex: 2.86 | pmikus | 6 | -79/+73
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Id56b87ab868f2897a6563914b0beca2acc25e706
2020-11-16 | Ansible: Remove vpp_device snergster dependency | pmikus | 2 | -1/+3
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I145a4b5511141f1e2b4e387daa358e32dd2c8015
2020-11-11 | Ansible: Remove vpp_device snergster dependency | pmikus | 4 | -0/+226
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Id14c3f2f4f8689256172dc2b3ebc4fbaed5de8d3
2020-10-20 | FIX: Ansible calibration | pmikus | 36 | -36/+0
The force check is doing its job, but vt.handoff was deployed later. vt.handoff (vt = virtual terminal) is a kernel boot parameter unique to Ubuntu; it is not an upstream kernel boot parameter. Its purpose is to let the kernel keep the current contents of video memory on a virtual terminal. While the operating system boots and moves past the boot loader, vt.handoff allows an aubergine background to be shown, with Plymouth displaying a logo and progress bar on top of it. Once the display manager comes up, it smoothly replaces this with a login prompt. Useless. Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I9d6db8833ccbef680fab2b643e9c3525bc709244
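To make the kind of check discussed above concrete, here is a minimal Python sketch that tests whether a kernel command line carries a given parameter such as vt.handoff and strips it. It is not the Ansible role's actual logic: the function names and the sample command line are hypothetical, chosen only for illustration.

```python
# Hypothetical sample; on a running Linux host the real value is in /proc/cmdline.
SAMPLE_CMDLINE = "BOOT_IMAGE=/boot/vmlinuz root=/dev/sda1 ro quiet splash vt.handoff=7"


def param_name(token: str) -> str:
    """Return the name part of a token, e.g. 'vt.handoff=7' -> 'vt.handoff'."""
    return token.split("=", 1)[0]


def has_param(cmdline: str, name: str) -> bool:
    """True when the command line contains the parameter, with or without a value."""
    return any(param_name(tok) == name for tok in cmdline.split())


def drop_param(cmdline: str, name: str) -> str:
    """Return the command line with every occurrence of the parameter removed."""
    return " ".join(tok for tok in cmdline.split() if param_name(tok) != name)


if __name__ == "__main__":
    print(has_param(SAMPLE_CMDLINE, "vt.handoff"))
    print(drop_param(SAMPLE_CMDLINE, "vt.handoff"))
```

Matching on the parameter name rather than on the full token means the check does not depend on the value a given host happens to use.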
2020-10-20 | FIX: Ansible asserts | Peter Mikus | 40 | -255/+256
Change-Id: Ib668674a2a267d2ceed458288d21181b2a937778 Signed-off-by: Peter Mikus <pmikus@cisco.com>
2020-10-15 | Ansible: Rework grub command line | pmikus | 46 | -20/+493
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I8abfc47e5e816e2ab4b39d7ad7575e672ae19ca6
2020-10-12 | Infra: Enable AMD | pmikus | 2 | -8/+8
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I3954157e91aafd370c3ac0843708903d19b84936
2020-10-07 | vpp_device: updates for 1n-tx2 testbeds | Juraj Linkeš | 7 | -31/+121
One ThunderX2 9975 server (.69) was replaced with two ThunderX2 9980 servers (.70, .71). Move the .69 server under the ansible perf section in anticipation of repurposing it for perf testing. Update the ansible scripts with the .70 and .71 config and rename ports in the device.sh lib to reflect the NIC differences between .69 and .70 (and .71). Change-Id: I88b75648735243e5559175d3192ffcc8fc70071c Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
2020-09-23 | FIX: Mellanox handling | pmikus | 1 | -0/+1
- From 2n-zn2 testing Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I9b1f0916f0f1d90a223918cfe48409d29f2ee773
2020-09-18 | Ansible: Add EPYC | pmikus | 3 | -0/+36
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Ic84b7cbcbcd269a5fe548b59d478cd6dfa57a952
2020-09-03 | Framework: Bump DPDK 20.08 | pmikus | 2 | -0/+8
+ DPDK 20.08
+ Migrate make -> meson
+ Fix all trending issues
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I31dcb22627c0f8d17ec63c5b138a2da958b006f4
2020-09-01 | Ansible: fix the wrong module and mode for vpp_device.yaml in cleanup role | Jieqiang Wang | 1 | -2/+2
The src field of the Ansible template module used in vpp_device.yaml of the cleanup role should be in Jinja2 format, and the script transferred to the remote host should be executable by the file owner. Fix this by replacing the Ansible template module with the Ansible copy module and setting the file permission to 744. Change-Id: Ibf80b0c5bec77a13509122795a5b12b6faba2f8e Signed-off-by: Jieqiang Wang <jieqiang.wang@arm.com>
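As a rough illustration of the permission chosen in the fix above, the following Python sketch decodes mode 744 with the standard `stat` module: read, write and execute for the owner, read-only for group and others. The helper names are hypothetical and this is not the role's task definition.

```python
import stat


def describe_mode(mode: int) -> str:
    """Return the ls-style permission string for a regular file with this mode."""
    return stat.filemode(stat.S_IFREG | mode)


def owner_can_execute(mode: int) -> bool:
    """True when the owner execute bit is set in the mode."""
    return bool(mode & stat.S_IXUSR)


if __name__ == "__main__":
    # 0o744 is the octal literal for the 744 permission named in the commit.
    print(describe_mode(0o744))
    print(owner_can_execute(0o744))
    # A plain 644 file would not be runnable as a script by its owner.
    print(owner_can_execute(0o644))
```

Note that the mode is octal: writing it as the decimal integer 744 in Python would describe a different set of permission bits.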
2020-08-28 | Ansible: Add arm nomads into pool | pmikus | 1 | -2/+2
Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I99a33a15e7e87fe6f56deb1ab0e4ce3091508244
2020-08-24 | T-Rex: 2.82, core pin, 8 workers | pmikus | 2 | -1/+2
+ Bump T-Rex version. We need new features for ASTF test.
+ Apply core pinning. Results in a more stable performance.
+ Tweak the number of T-Rex workers.
  + We need an even value to achieve symmetric performance with pinning.
  + Value 8 was selected as the best compromise.
This is a combination of 3 commits.
This is the 1st commit message:
T-Rex: 2.82
This is the commit message #2:
Change TRex to CORE_MASK_PIN mode to improve performance
https://trex-tgn.cisco.com/trex/doc/trex_stateless.html#_core_masking_per_interface
The link above gives the following explanation: "When the profile is symmetric, performance can be improved by pinning half of the cores to port 0, and half of the cores to port 1, thus avoiding cache trashing and bouncing."
The reason for this change is that running CSIT with a 100G NIC often failed with "TRex stateless runtime error timeout", caused by TRex being unable to send enough traffic within the fixed duration. Changing to CORE_MASK_PIN mode fixes the issue.
Not editing ASTF, as that supports different options.
This is the commit message #3:
Experiment: Vary number of TRex workers
With CORE_MASK_PIN, we can get a more predictable time distribution. Decided to use 8 workers, which gives good results for both high-end (RDMA-core l2patch) and low-end (vhost) tests.
Change-Id: I5c61127799e0624464e960fcb980ad1c4058e744 Signed-off-by: pmikus <pmikus@cisco.com> Signed-off-by: Yulong Pei <yulong.pei@intel.com> Signed-off-by: Vratko Polak <vrpolak@cisco.com>
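The "half of the cores per port" idea quoted above, and the reason an even worker count is needed, can be sketched in a few lines of Python. This is only an illustration: the function name `pin_masks` is hypothetical, and assigning the lower half of the workers to port 0 and the upper half to port 1 is an assumption; the exact assignment T-Rex makes in CORE_MASK_PIN mode may differ.

```python
from typing import Tuple


def pin_masks(workers: int) -> Tuple[int, int]:
    """Split `workers` cores evenly into one bitmask per port.

    Assumption for illustration: bit i set in a mask means worker i serves
    that port, with the lower half on port 0 and the upper half on port 1.
    """
    if workers <= 0 or workers % 2 != 0:
        # An odd count cannot be split into two equal halves, so the two
        # directions of a symmetric profile would get unequal resources.
        raise ValueError("worker count must be a positive even number")
    half = workers // 2
    port0_mask = (1 << half) - 1      # lowest `half` bits set
    port1_mask = port0_mask << half   # next `half` bits set
    return port0_mask, port1_mask


if __name__ == "__main__":
    m0, m1 = pin_masks(8)
    print(f"port 0: {m0:#04x}, port 1: {m1:#04x}")
```

With 8 workers each port gets four dedicated cores and the masks do not overlap, which is the property the quoted documentation credits with avoiding cache bouncing.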