The following trace contains AWS Reserved Instance Marketplace (RIM) data for every VM type in every AWS region, sampled every 30 minutes from September 2018 to May 2020. Data collection is ongoing, so check this page for future updates.
AWS-Reserved-Marketplace-Data.zip
Note that we cannot distinguish actual sales of reserved instances in RIM from listing cancellations; we can only observe when listings appear on and disappear from the market.
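As an illustration of what can be observed, the sketch below compares two consecutive snapshots and reports which listings appeared and which disappeared. It assumes each listing carries a stable identifier (for example the `ReservedInstancesOfferingId` field returned by the EC2 API); the function name `diff_snapshots` is hypothetical and is not part of the released data or scripts.

```python
def diff_snapshots(previous_ids, current_ids):
    """Compare the listing IDs seen in two consecutive snapshots.

    A listing in "disappeared" was either sold or cancelled; the trace
    alone cannot tell which of the two happened.
    """
    prev, curr = set(previous_ids), set(current_ids)
    return {
        "appeared": sorted(curr - prev),      # new on the market
        "disappeared": sorted(prev - curr),   # sold OR cancelled
    }


# Example: listing "a" left the market and listing "d" entered it.
changes = diff_snapshots(["a", "b", "c"], ["b", "c", "d"])
```

With 30-minute sampling, a listing that is posted and removed between two snapshots would not show up in either set.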
All traces are in CSV format. The data were collected with a simple Python script that uses Boto3, the AWS SDK for Python, to call the EC2 describe-reserved-instances-offerings API. See the AWS documentation ([https://docs.aws.amazon.com/cli/latest/reference/ec2/describe-reserved-instances-offerings.html](https://docs.aws.amazon.com/cli/latest/reference/ec2/describe-reserved-instances-offerings.html)) for details on the data fields recorded in the traces.
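For readers who want to collect a comparable snapshot themselves, the following is a minimal sketch of such a collector, not the script used to produce the traces. It assumes Boto3 is installed and AWS credentials are configured. The helper names and the CSV column names in `FIELDS` are hypothetical and need not match the columns of the released files; the response keys (`ReservedInstancesOfferings`, `PricingDetails`, `Marketplace`, and so on) are those documented for the API.

```python
import csv
import io

# Hypothetical column names chosen for this sketch only.
FIELDS = ["timestamp", "region", "offering_id", "instance_type",
          "availability_zone", "duration_s", "product", "price", "count"]


def fetch_marketplace_offerings(region):
    """Yield third-party (marketplace) offerings for one region."""
    import boto3  # imported lazily so the helpers below work without it

    client = boto3.client("ec2", region_name=region)
    paginator = client.get_paginator("describe_reserved_instances_offerings")
    for page in paginator.paginate(IncludeMarketplace=True):
        for offering in page["ReservedInstancesOfferings"]:
            if offering.get("Marketplace"):  # skip offerings sold by AWS itself
                yield offering


def offering_to_rows(offering, region, timestamp):
    """Flatten one offering into one row per price point."""
    rows = []
    for detail in offering.get("PricingDetails", []):
        rows.append({
            "timestamp": timestamp,
            "region": region,
            "offering_id": offering.get("ReservedInstancesOfferingId"),
            "instance_type": offering.get("InstanceType"),
            "availability_zone": offering.get("AvailabilityZone"),
            "duration_s": offering.get("Duration"),
            "product": offering.get("ProductDescription"),
            "price": detail.get("Price"),
            "count": detail.get("Count"),
        })
    return rows


def snapshot_to_csv(offerings, region, timestamp):
    """Serialize an iterable of offerings to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for offering in offerings:
        writer.writerows(offering_to_rows(offering, region, timestamp))
    return buffer.getvalue()
```

A single snapshot for one region could then be produced with `snapshot_to_csv(fetch_marketplace_offerings("us-east-1"), "us-east-1", "2020-05-01T00:00:00Z")`, and repeated on a timer for each region of interest.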
Our HotCloud’20 paper, No Reservations: A First Look at Amazon's Reserved Instance Marketplace, provides an initial analysis of the RIM data. Please cite this paper when making use of the data.
The job trace includes 14M jobs from a production high-performance computing cluster consisting of 14,376 cores. The cluster is the University of Massachusetts (UMass) System Shared Cluster, and is available for general use to researchers from all five campuses in the UMass system, including its medical school. The cluster is located at the Massachusetts Green High Performance Computing Center (MGHPCC), a 15MW data center in Holyoke, Massachusetts that also hosts computing infrastructure for Boston University, Harvard, MIT, and Northeastern. The cluster runs the LSF job scheduler, and the included trace is the production log from 2016.
Each job entry in the trace includes its submission time, user ID, maximum running time limit, requested number of cores and memory, and running time. The trace is stored in Hierarchical Data Format (HDF). HDF is supported in popular programming languages such as C++, Java, and Python, through either third-party libraries or built-in language APIs.
Please find the description of the important fields in the trace below:
Our SC’20 paper, Waiting Game: Optimally Provisioning Fixed Resources for Cloud-Enabled Schedulers, provides an analysis of the batch trace. Please cite this paper when making use of the data.
We implemented a trace-driven job simulator in Python that mimics a cloud-enabled job scheduler, which can acquire VMs on demand to service jobs. The simulator uses an FCFS scheduling policy and also implements each of our waiting policies: NJW (No Jobs Waiting), AJW (All Jobs Waiting), AJW-T (All Jobs Waiting Threshold), SWW (Short Waits Wait), LJW (Long Jobs Waiting), and the Compound policy.
Github link: https://github.com/sustainablecomputinglab/waitinggame
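To give a feel for how such a simulator works, here is a heavily simplified sketch; it is not the code in the repository above. It assumes every job needs exactly one core, that runtimes are known, and that a single parameter `max_wait` captures the waiting policy: 0 approximates NJW, infinity approximates AJW, and a finite value approximates AJW-T, where a job whose wait would exceed the threshold waits `max_wait` and then runs on an on-demand VM. These readings of the policy names, and the function name `simulate`, are assumptions made for illustration.

```python
import heapq
import math


def simulate(jobs, fixed_cores, max_wait):
    """Replay (submit_time, runtime) jobs in FCFS order on a fixed pool.

    Returns the mean wait and the total core-time sent to on-demand VMs.
    """
    if fixed_cores < 1:
        raise ValueError("this sketch needs at least one fixed core")

    free_at = [0.0] * fixed_cores  # min-heap: when each fixed core is next free
    heapq.heapify(free_at)
    total_wait = 0.0
    on_demand_time = 0.0

    # Stable sort by submission time keeps FCFS order for ties.
    for submit, runtime in sorted(jobs, key=lambda job: job[0]):
        start = max(submit, free_at[0])  # earliest start on a fixed core
        wait = start - submit
        if wait <= max_wait:
            # Run on the fixed core that frees up first.
            heapq.heapreplace(free_at, start + runtime)
            total_wait += wait
        else:
            # Give up after max_wait and run on an on-demand VM instead.
            total_wait += max_wait
            on_demand_time += runtime

    return {
        "mean_wait": total_wait / len(jobs) if jobs else 0.0,
        "on_demand_core_time": on_demand_time,
    }


jobs = [(0, 10), (0, 10), (1, 5)]
njw = simulate(jobs, fixed_cores=1, max_wait=0)          # no job waits
ajw = simulate(jobs, fixed_cores=1, max_wait=math.inf)   # every job waits
ajw_t = simulate(jobs, fixed_cores=1, max_wait=10)       # wait at most 10
```

Sweeping `fixed_cores` and `max_wait` over a trace shows the basic trade-off the policies navigate: more waiting means less on-demand usage, and vice versa. Multi-core jobs, memory requests, and the SWW, LJW, and Compound policies are left out of this sketch.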