What is PICS?
PICS (Public IaaS Cloud Simulator) is designed for evaluating the performance of both public IaaS clouds and cloud applications without actually deploying the applications.
- Main Capabilities:
- Assessing a wide range of properties of cloud services and cloud applications, including cloud cost, job response time, and resource utilization.
- Allowing PICS users to specify different workload types, including dynamic job arrival patterns and SLA requirements (e.g. job deadline).
- Simulating a broad range of resource management policies, including horizontal and vertical auto scaling, custom job scheduling policies, and job failure cases.
- Enabling PICS users to evaluate the performance of different types of current public IaaS cloud configurations such as a variety of resource types (VM instances and storage services), unique billing models, and performance uncertainty.
PICS Overview
- Design Goal
- Precise modeling of the behavior of public cloud providers (including a variety of cloud resources, e.g., VM, storage, and network).
- Precise modeling of the behavior of public cloud applications (including dynamic workload changes and performance uncertainty).
- Precise modeling of the behavior of cloud users' resource management policies.
- Simulation Inputs (PICS Configurations): PICS inputs consist of three configuration files.
- General PICS configurations (./config/config.txt)
- Simulator Configurations
- Public IaaS Configurations (Billing, VM, Storage, and Network)
- Job Management Configurations (Scheduling, Failure Management)
- VM Management Configurations (VM Selection, Scaling)
- VM configurations (./config/vm_config.txt)
- VM Type and Price
- VM CPU and Network Performance
- Workload configurations (./workload/workload.csv)
- Job Arrival, Deadline, Duration, Data Usage Information
- Workload file example
The goal of PICS is to precisely simulate the behavior of public IaaS clouds from the cloud users' perspective, as if they had deployed a particular cloud application on public IaaS clouds.
General PICS Configurations Detail (./config/config.txt)
# A. Simulation Configuration
# A.1 Sim Trace Interval (e.g. every 60 sec)
SIM_TRACE_INTERVAL=60
# A.2 Workload File Path
WORK_LOAD_FILE=workload.csv
# A.3 VM Configuration Path
VM_CONFIG_FILE=vm_config.txt
# B. Public IaaS Configuration
# B.1 VM Billing Config - Set IaaS Pricing Model (Hour or Min-based)
VM_BILLING_TIME_UNIT=BTU_HOUR
# BTU_HOUR: Hourly Billing Model (e.g. Amazon Web Services)
# BTU_MIN : Minutely Billing Model (e.g. MS Azure & Google Compute Engine)
# B.1 VM Billing Conf - Set Billing Time Period (Int and > 0)
VM_BILLING_TIME_PERIOD=1
# IaaS Billing Model = VM_BILLING_TIME_UNIT * VM_BILLING_TIME_PERIOD
# e.g. VM_BILLING_TIME_UNIT=BTU_MIN and VM_BILLING_TIME_PERIOD=30
# --> 30min based billing
# Note that Unit Cost($) for each VM is defined at VM_CONFIG_FILE
# B.2 VM Startup Delay (Lagtime) - Min Startup LagTime for Creating a new VM
MIN_STARTUP_LAG=30
# B.2 VM Startup Delay (Lagtime) - Max Startup LagTime for Creating a new VM
MAX_STARTUP_LAG=60
# PICS determines startup lagtime for creating a new VM
# from MIN_STARTUP_LAG (>=0) to MAX_STARTUP_LAG (>=0)
# (MAX_STARTUP_LAG >= MIN_STARTUP_LAG)
# C. Cloud Storage Configurations
# C.1 Max Volume of Cloud Storage
# Unit: Mega Bytes
MAX_CAPACITY_OF_STORAGE=10240000
# C.2 Storage Usage Cost ($) per Gigabyte/Month
# STORAGE_UNIT_COST > 0
STORAGE_UNIT_COST=0.1
# C.3 Storage Billing Time Unit (Second, > 0)
# 1 : 1sec
# 60 : 1 min
# 3600 : 1 Hour
# 86400 : 1 Day
# 2592000: 1 Month
STORAGE_BILLING_TIME_UNIT=1
# D. Network Configuration
# D.1 Network Bandwidth for Data Transfer (Unit: MB/s)
# D.1.1 Bandwidth from Cloud to Cloud
PERF_DATA_TRANSFER_CLOUD=3.0
# D.1.2 Bandwidth for Incoming Traffic
PERF_DATA_TRANSFER_IN=2.0
# D.1.3 Bandwidth for Outgoing Traffic
PERF_DATA_TRANSFER_OUT=1.0
# D.2 Network Cost for Data Transfer (Unit:$)
# D.2.1 Network Cost from Cloud to Cloud
COST_DATA_TRANSFER_CLOUD=0.01
# D.2.2 Network Cost for Incoming Traffic
COST_DATA_TRANSFER_IN=0.02
# D.2.3 Network Cost for Outgoing Traffic
COST_DATA_TRANSFER_OUT=0.03
# E. Job Management Configurations
# E.1 Job Scheduling Configuration
JOB_ASSIGNMENT_POLICY=EDF
# E.2 Job Failure Configurations
# E.2.1 Probability for Job Failure Occurrence
# 0 <= PROB_JOB_FAILURE <= 1
# e.g. 0.05: 5%
PROB_JOB_FAILURE=0.05
# E.3 Job Failure Recovery Policy
# JF-POLICY-01: ignore the failed job
# JF-POLICY-02: re-execute the failed job
# JF-POLICY-03: move the failed job to end of the job queue
# JF-POLICY-04: find another VM (running or new) to satisfy
# the failed job deadline
JOB_FAILURE_POLICY=JF-POLICY-01
# F. VM Management Configuration
# F.1 VM Selection Policy for VM Scaling-up
# VM-SEL-COST : Cost based VM Selection
# VM-SEL-PERF : Performance based VM Selection
# VM-SEL-COSTPERF : Cost/Performance Balanced VM Selection
VM_SELECTION_METHOD=VM-SEL-COST
# F.2 Max Num of Concurrent VMs
# MAX_NUM_OF_CONCURRENT_VMS is
# > 0 or UNLIMITED
MAX_NUM_OF_CONCURRENT_VMS=UNLIMITED
# F.3 VM Scale Down Policy
# SD-IM : Immediate VM Scale-down when the VM is idle
# SD-HR : Hourly Billing Model-based Scale-Down (e.g. AWS)
# SD-MN : Minutely Billing Model-based Scale-Down (e.g. MS Azure)
# SD-SL : Startup-Lag based Scale-Down
# SD-JAT-MEAN : Mean Job Arrival Rate-based Scale Down
# SD-JAT-MAX : Maximum Job Arrival Rate-based Scale Down
# SD-JAT-MEAN-RECENT: Mean Recent Job Arrival Rate-based Scale Down (kNN)
# SD-JAT-MAX-RECENT : Max Recent Job Arrival Rate-based Scale Down
# SD-JAT-SLR : Simple Linear Regression (JAT)-based Scale Down
# SD-JAT-2PR : Quadratic Regression (JAT)-based Scale Down
# SD-JAT-3PR : Cubic Regression (JAT)-based Scale Down
# SD-JAT-LLR : Local Linear Regression (JAT)-based Scale Down
# SD-JAT-L2PR : Local Quadratic Regression (JAT)-based Scale Down
# SD-JAT-L3PR : Local Cubic Regression (JAT)-based Scale Down
# SD-JAT-WMA : Weighted Moving Average (JAT)-based Scale Down
# SD-JAT-ES : Exponential Smoothing (JAT)-based Scale Down
# SD-JAT-HWDES : Holt-Winters Double Exponential Smoothing (JAT)-based
# SD-JAT-BRDES : Brown's Double Exponential Smoothing (JAT)-based
# SD-JAT-AR : Autoregressive-based Scale Down
# SD-JAT-ARMA : Autoregressive and Moving Average-based Scale Down
# SD-JAT-ARIMA : Autoregressive Integrated Moving Average-based
VM_SCALE_DOWN_POLICY_NAME=SD-IM
# F.4 VM Scale Down Policy Unit
# This configuration is only applicable to
# billing model based Scale Down
# (e.g. SD-HR and SD-MN)
# e.g. VM_SCALE_DOWN_POLICY_NAME=SD-MN and VM_SCALE_DOWN_POLICY_UNIT=10
# --> 10 min based Scale Down
VM_SCALE_DOWN_POLICY_UNIT=1
# F.5 Num of Recent Sample for SD Policies
# This configuration is applicable for RECENT-based SD policies.
# (e.g. SD-JAT-*-RECENT)
VM_SCALE_DOWN_POLICY_RECENT_SAMPLE_CNT=50
# F.6 First Parameter for Timeseries.
# alpha for WMA, ES, HWDES, BRDES: 0 < alpha < 1
# p for AR, ARMA, ARIMA (p >= 0)
VM_SCALE_DOWN_POLICY_PARAM1=0.5
# F.7 Second Parameter for Timeseries.
# beta for HWDES (0 < beta < 1)
# q for ARMA and ARIMA (q >= 0)
VM_SCALE_DOWN_POLICY_PARAM2=0.5
# F.8 Third Parameter for Timeseries.
# d for ARIMA (d >= 0)
VM_SCALE_DOWN_POLICY_PARAM3=2
# F.9 MIN/MAX for Wait Time of VM Scale Down
# These MIN/MAX fields are related to predictive methods such as SD-JAT-SLR.
# To handle wrong prediction results
# --> too short (or negative) or too long wait time
VM_SCALE_DOWN_POLICY_MIN_WAIT_TIME=1
VM_SCALE_DOWN_POLICY_MAX_WAIT_TIME=UNLIMITED
# F.10 Vertical Scaling
# Vertical Scaling - Enable: YES, Disable: NO
# When enabling Vertical Scaling, MAX_NUM_OF_CONCURRENT_VMS
# shouldn't be UNLIMITED
ENABLE_VERTICAL_SCALING=NO
# F.11 Vertical Scaling Operation
# VSCALE-UP : Only VScale-up
# (triggered when VM cannot meet deadline for queued jobs)
# VSCALE-DOWN : Only VScale-down
# (triggered when VM meets deadline - find most suitable one
# for queued jobs, e.g. cheapest VM with deadline satisfaction)
# VSCALE-BOTH : Both VScale-up/down
# F.12 Vertical Scaling Options
VERTICAL_SCALING_OPERATION=VSCALE-BOTH
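The billing and startup-lag settings above can be read and checked with a few lines of Python. This is a minimal sketch, not PICS's own loader: the function names (`parse_config`, `billing_period_seconds`, `billed_periods`, `startup_lag_is_valid`) are hypothetical, and rounding a partial billing period up to a whole one is an assumption about how the billing model is applied, not something the configuration file states.

```python
import math

# Seconds per billing time unit named in config.txt.
UNIT_SECONDS = {"BTU_HOUR": 3600, "BTU_MIN": 60}


def parse_config(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def billing_period_seconds(settings):
    """Billing model = VM_BILLING_TIME_UNIT * VM_BILLING_TIME_PERIOD."""
    unit = UNIT_SECONDS[settings["VM_BILLING_TIME_UNIT"]]
    return unit * int(settings["VM_BILLING_TIME_PERIOD"])


def billed_periods(runtime_seconds, period_seconds):
    """Assumption: a partially used billing period is charged in full."""
    return math.ceil(runtime_seconds / period_seconds)


def startup_lag_is_valid(settings):
    """Check 0 <= MIN_STARTUP_LAG <= MAX_STARTUP_LAG, as the comments require."""
    low = int(settings["MIN_STARTUP_LAG"])
    high = int(settings["MAX_STARTUP_LAG"])
    return 0 <= low <= high


sample = """
# 30-minute billing, as in the example from the config comments
VM_BILLING_TIME_UNIT=BTU_MIN
VM_BILLING_TIME_PERIOD=30
MIN_STARTUP_LAG=30
MAX_STARTUP_LAG=60
"""
cfg = parse_config(sample)
period = billing_period_seconds(cfg)
print(period, billed_periods(1801, period), startup_lag_is_valid(cfg))
```

Multiplying `billed_periods(...)` by the unit price from the VM configuration file would give a per-VM cost under the same rounding assumption.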
VM Configurations Detail (./config/vm_config.txt)
# Number of VM types used in PICS simulation and n > 0
NO_OF_VM_TYPES=n
# First VM Type Name
VM1_TYPE_NAME=t2.micro
# First VM Unit Price ($)
VM1_UNIT_PRICE=0.1
# First VM CPU Performance
# Used to calculate job duration on VM type
# A lower CPU factor is better
VM1_CPU_FACTOR=2.0
# First VM Network Performance
# Used to calculate data transfer rate on VM type
# A lower NET factor is better
VM1_NET_FACTOR=2.0
# Second VM Type Name
VM2_TYPE_NAME=t2.micro
# Second VM Unit Price ($)
VM2_UNIT_PRICE=0.2
# Second VM CPU Performance
VM2_CPU_FACTOR=1.5
# Second VM Network Performance
VM2_NET_FACTOR=1.5
...
# nth VM Type Name
VMn_TYPE_NAME=nth_VM_Type
# nth VM Unit Price ($)
VMn_UNIT_PRICE=1.0
# nth VM CPU Performance
VMn_CPU_FACTOR=1.0
# nth VM Network Performance
VMn_NET_FACTOR=1.0
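The numbered `VMi_*` keys map naturally onto one record per VM type. The sketch below assumes the settings have already been read into a dictionary of strings; `load_vm_types` and `actual_duration` are hypothetical helper names, and the type names `small` and `medium` are placeholders. The duration rule (standard duration multiplied by `CPU_FACTOR`) is the one given in the workload notes of this page.

```python
def load_vm_types(settings):
    """Group VM1_*, VM2_*, ... keys into one dictionary per VM type."""
    vm_types = []
    for i in range(1, int(settings["NO_OF_VM_TYPES"]) + 1):
        prefix = f"VM{i}_"
        vm_types.append({
            "name": settings[prefix + "TYPE_NAME"],
            "unit_price": float(settings[prefix + "UNIT_PRICE"]),
            "cpu_factor": float(settings[prefix + "CPU_FACTOR"]),
            "net_factor": float(settings[prefix + "NET_FACTOR"]),
        })
    return vm_types


def actual_duration(standard_duration, vm_type):
    """Actual job duration = standard duration * the VM's CPU_FACTOR."""
    return standard_duration * vm_type["cpu_factor"]


# Illustrative settings; the type names are placeholders.
settings = {
    "NO_OF_VM_TYPES": "2",
    "VM1_TYPE_NAME": "small",
    "VM1_UNIT_PRICE": "0.1",
    "VM1_CPU_FACTOR": "2.0",
    "VM1_NET_FACTOR": "2.0",
    "VM2_TYPE_NAME": "medium",
    "VM2_UNIT_PRICE": "0.2",
    "VM2_CPU_FACTOR": "1.5",
    "VM2_NET_FACTOR": "1.5",
}
vm_types = load_vm_types(settings)
durations = {vm["name"]: actual_duration(200, vm) for vm in vm_types}
print(durations)
```

A lower factor yields a shorter duration, which is why the comments describe lower values as better.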
Workload Configurations Detail (./workload/workload.csv)
#job_submit_interval,job_duration,job_deadline,input_data,output_data
100,200,1000,NONE_0,NONE_0
100,200,1500,NONE_0,IC_524288
100,200,2000,NONE_0,OC_524288
100,200,2500,IC_524288,NONE_0
100,200,3000,IC_524288,IC_524288
100,200,3500,IC_524288,OC_524288
100,200,4000,OC_524288,NONE_0
100,200,4500,OC_524288,IC_524288
100,200,5000,OC_524288,OC_524288
- job_submit_interval: the job generation interval (unit: PICS simulation clock, in seconds). e.g.
100,200,1000,NONE_0,NONE_0
100,200,1500,NONE_0,IC_524288
The first job will be generated at 100 simulation seconds.
The next job will be generated at 200 seconds (100 seconds after the previous job generation).
- job_duration: the standard job duration.
The actual job duration on each VM is calculated as
standard duration * the VM's CPU_FACTOR
e.g. the actual duration of a job (job_duration=200) on a VM with CPU_FACTOR=2.0 is 400 (200 * 2.0).
- input_data:
NONE_0: No input data (size is zero).
IC_xxx: xxx mega bytes of input data, transfer direction: Cloud => Cloud.
OC_xxx: xxx mega bytes of input data, transfer direction: Outside => Cloud.
- output_data:
NONE_0: No output data (size is zero).
IC_xxx: xxx mega bytes of output data, transfer direction: Cloud => Cloud.
OC_xxx: xxx mega bytes of output data, transfer direction: Cloud => Outside.
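The workload rules above can be sketched in Python: intervals accumulate into generation times, and each data field splits into a direction code and a size. This is an illustration under stated assumptions rather than PICS's own parser; `parse_data` and `load_jobs` are hypothetical names, and the deadline is stored exactly as written in the file because the notes do not say whether it is relative or absolute.

```python
import csv
import io

SAMPLE = """#job_submit_interval,job_duration,job_deadline,input_data,output_data
100,200,1000,NONE_0,NONE_0
100,200,1500,NONE_0,IC_524288
"""


def parse_data(field):
    """Split 'IC_524288' into ('IC', 524288); 'NONE_0' gives ('NONE', 0)."""
    kind, _, size = field.strip().partition("_")
    return kind.upper(), int(size)


def load_jobs(text):
    """Turn workload rows into job records with cumulative generation times."""
    jobs = []
    clock = 0
    for row in csv.reader(io.StringIO(text)):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and the commented header
        interval, duration, deadline = (int(value) for value in row[:3])
        clock += interval  # each job arrives `interval` seconds after the previous one
        jobs.append({
            "generated_at": clock,
            "duration": duration,
            "deadline": deadline,
            "input": parse_data(row[3]),
            "output": parse_data(row[4]),
        })
    return jobs


jobs = load_jobs(SAMPLE)
for job in jobs:
    # Actual duration on a VM with CPU_FACTOR=2.0, per the rule above.
    print(job["generated_at"], job["duration"] * 2.0, job["input"], job["output"])
```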
(*) You can find report files at "Logs/pics_log-YYYY-MM-DD-hh-mm-ss/Report/"
(*) In most cases, the following THREE (*) result files are the most important ones:
- 1.report_simulation_trace_broker.csv
- 3.report_job_complete_report.csv
- 4.report_vm_usage_report.csv
- 1.report_simulation_trace_broker.csv provides
- real time trace for incoming workloads. (e.g. JOB_RECV(CUMM) and JOB_RECV(UNIT))
- real time trace for workload completion. (e.g. JOB_COMP(CUMM) and JOB_COMP(UNIT))
- real time trace for VM usage status and cost. (e.g. VM_*)
- Meaning of all attributes for 1.report_simulation_trace_broker.csv
- CLOCK: Simulation clock.
- JOB_RECV(CUMM): The cumulative number of jobs received up to the current simulation clock.
- JOB_RECV(UNIT): The number of jobs received at the simulation clock.
- JOB_COMP(CUMM): The cumulative number of jobs completed up to the current simulation clock.
- JOB_COMP(UNIT): The number of jobs completed at the simulation clock.
- VM_RUN: The number of currently running (VM_STUP + VM_ACT) VMs including currently starting up VMs and active VMs, and not including stopped VMs (VM_STOP).
- VM_STUP: The number of currently starting up VMs.
- VM_ACT: The number of currently active VMs.
- VM_STOP: The number of currently stopped VMs.
- VM_COST($): The accumulated VM cost at the simulation clock.
- 2.report_simulation_trace_iaas.csv provides
- real time trace for cloud storage usage and cost.
- real time trace for network usage and cost.
- Meaning of all attributes for 2.report_simulation_trace_iaas.csv
- CLOCK: Simulation clock.
- # SC: The number of storage containers (e.g. S3 buckets).
- # SFO: The number of storage file objects (e.g. total # of files in all S3 buckets).
- ST_SIZE (KB): The current size (Kilo Bytes) of cloud storage (e.g. S3) at the simulation clock.
- ST_COST ($): The current cost of cloud storage (e.g. S3) at the simulation clock.
- NET-IN (KB): The amount of data transmission (outside of clouds --> clouds (IaaS data center)) at the simulation clock.
- NET-OUT (KB): The amount of data transmission (clouds (IaaS data center)--> outside of clouds) at the simulation clock.
- NET-CLOUD (KB): The amount of data transmission (clouds <--> clouds in the same data center) at the simulation clock.
- NET-IN_COST ($): The network cost for NET-IN (item #6).
- NET-OUT_COST ($): The network cost for NET-OUT (item #7).
- NET-CLOUD_COST ($): The network cost for NET-CLOUD (item #8).
- NET_COST ($): The total cost for network usage: NET-IN_COST (item #9) + NET-OUT_COST (item #10) + NET-CLOUD_COST (item #11).
- 3.report_job_complete_report.csv provides
- detailed information for each workload processing.
- CPU time -- e.g., item #5: CPU
- Network time -- e.g., item #4 and #6: IN/OUT
- Deadline satisfaction -- e.g., item #15: DF
- Total duration -- e.g. item #13: TD and item #14: RT
- Cost for each workload processing -- e.g. item #16: CO($)
- Meaning of all attributes for 3.report_job_complete_report.csv
- ID: Job ID.
- JN: Job Name.
- ADR: Actual Job Duration.
- IN: Network time for data transmission to VM before this job processing.
- CPU: CPU time for the job processing.
- OUT: Network time for data transmission from VM to outside of VM after this job processing (e.g. output file data transfer).
- DL: Job deadline.
- VM: Assigned VM ID for this job processing.
- TG: Time for job generation. (job ingress time.)
- TA: Time for job assignment to the particular VM.
- TS: Time for job processing start.
- TC: Time for job processing completion.
- TD: Total duration for job processing. (TC - TG)
- RT: Job runtime. (TC - TS)
- DF: Difference from job deadline
- Positive value: job deadline satisfaction.
- Negative value: job deadline miss.
- CO($): Cost for this job processing.
- ST: Job state.
- JOB_ST_COMPLETED (3004): this job is successfully completed.
- JOB_ST_FAILED (3005): this job failed.
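The job report attributes lend themselves to a quick summary. The sketch below recomputes TD (TC - TG) and RT (TC - TS) and counts deadline misses from the sign of DF. It assumes the CSV header uses the attribute names listed above, which should be checked against a real report file; the two rows are invented purely for illustration, and `summarize_jobs` is a hypothetical helper name.

```python
import csv
import io

# Illustrative rows only; a real report has more columns and real values.
SAMPLE_REPORT = """ID,TG,TS,TC,DF
1,100,130,530,470
2,200,530,930,-30
"""


def summarize_jobs(text):
    """Recompute TD and RT per job and the share of jobs that missed their deadline."""
    rows = list(csv.DictReader(io.StringIO(text)))
    total_durations = [float(r["TC"]) - float(r["TG"]) for r in rows]  # TD = TC - TG
    runtimes = [float(r["TC"]) - float(r["TS"]) for r in rows]         # RT = TC - TS
    misses = sum(1 for r in rows if float(r["DF"]) < 0)                # negative DF = miss
    return {
        "total_durations": total_durations,
        "runtimes": runtimes,
        "miss_rate": misses / len(rows) if rows else 0.0,
    }


summary = summarize_jobs(SAMPLE_REPORT)
print(summary)
```

To use it on an actual run, read 3.report_job_complete_report.csv from the report directory and pass its contents to `summarize_jobs`.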
- 4.report_vm_usage_report.csv provides
- detailed information for VM usage.
- VM usage cost -- e.g. item #3: CO($)
- VM running time -- e.g. item #2: RT
- VM utilization -- e.g. item #13: UT
- # of processed jobs -- e.g. item #11: NJ
- Vertical scaling decision -- e.g. item #18, #19, and #20.
- Meaning of all attributes for 4.report_vm_usage_report.csv
- VMID: VM (Virtual Machine) ID.
- RT: VM runtime.
- CO($): VM Cost.
- IID: VM Instance ID.
- TY: VM Type. (e.g. m3.xlarge)
- ST: VM State.
- VM_ST_CREATING (3101): This VM is currently being created.
- VM_ST_ACTIVE (3102): This VM is currently running (active).
- VM_ST_TERMINATE (3103): This VM is terminated.
- TC: Time when the VM was created.
- TA: Time when the VM was activated.
- TT: Time when the VM was terminated.
- SL: Startup lag time for this VM.
- NJ: The number of jobs processed by this VM.
- JR: Job runtime on this VM.
- UT: VM Utilization (e.g. 0.9 = 90% of utilization)
- SR: Startup portion of total VM running time.
- ID: Idle portion of total VM running time.
- LJCT: Simulation clock for the last job completion on this VM.
- SDWT: Scale down wait time -- wait time before termination of this VM.
- IS_VS_VICTIM: True if this VM is a victim of vertical scaling up. False if it is not eligible for vertical scaling.
- VS_CASE: Case for vertical scaling.
- VS_VICTIM_ID: -1 if this VM is not related to vertical scaling. Otherwise, this VM is a vertical scaling case, and this field identifies the victim of vertical scaling.
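A per-run summary of the VM usage report follows from three of the attributes above: CO($), NJ, and UT. The sketch below assumes the CSV header matches those attribute names, which should be verified against a real report; the rows are invented for illustration, and `summarize_vms` is a hypothetical helper name.

```python
import csv
import io

# Illustrative rows only; a real report contains all attributes listed above.
SAMPLE_VM_REPORT = """VMID,RT,CO($),NJ,UT
1,3600,0.1,4,0.5
2,7200,0.2,6,0.75
"""


def summarize_vms(text):
    """Aggregate total VM cost, total processed jobs, and mean utilization."""
    rows = list(csv.DictReader(io.StringIO(text)))
    total_cost = sum(float(r["CO($)"]) for r in rows)
    total_jobs = sum(int(r["NJ"]) for r in rows)
    mean_utilization = sum(float(r["UT"]) for r in rows) / len(rows) if rows else 0.0
    return {
        "total_cost": total_cost,
        "total_jobs": total_jobs,
        "cost_per_job": total_cost / total_jobs if total_jobs else 0.0,
        "mean_utilization": mean_utilization,
    }


vm_summary = summarize_vms(SAMPLE_VM_REPORT)
print(vm_summary)
```

Comparing these figures across runs with different VM_SCALE_DOWN_POLICY_NAME settings is one way to see the cost and utilization trade-off of each policy.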
- 5.report_storage_usage.csv provides
- detailed information for Cloud Storage usage including storage usage time, cost, and volume size.
- Meaning of all attributes for 5.report_storage_usage.csv
- If TY is SC: this is information for storage container. (e.g. S3)
- ID: Storage Container ID. (e.g. S3 ID)
- CR: ID of the job that created this storage container.
- TR: The simulation entity that terminates this storage container.
- RG: Storage container region.
- PM: Permission for this storage. (e.g. SC_PERMISSION_PUBLIC (4001): public, SC_PERMISSION_PRIVATE (4002): private, SC_PERMISSION_GROUP (4003): group permission)
- ST: State for the storage container.
- CT: Time for creation.
- DT: Time for deletion.
- DR: Duration for which this storage container is active.
- NF: The number of stored files.
- VL(KB): The volume for storage container.
- CO($): Cost for this storage container.
- If TY is SFO: this is information for file object in particular storage container.
- ID: File object ID.
- SC: Storage container ID.
- SZ(KB): File object size (KB).
- ON: ID of the job that created the file.
- ST: File status.
- DST: Data status.
- PSZ(KB): File (planned) size.
- CT: File creation time.
- AT: File activated time.
- DT: File deleted time.
- DR: File active duration.
- CO($): File storage cost.
- 6.report_network_usage.csv provides
- detailed information for Network usage including network cost for incoming/outgoing data transfer.
- Meaning of all attributes for 6.report_network_usage.csv
- JOBID: Job ID for network usage.
- IN_TS(KB): Input file size.
- IN_DR: Input file flow direction. (e.g., IFTD_IC: input file is from inside clouds, IFTD_OC: input file is from outside of clouds.)
- IN_COST($): Network usage cost for input file.
- OUT_TS_PLANNED(KB): Output file size (planned).
- OUT_TS_ACTUAL(KB): Output file size (actual).
- OUT_COST($): Network usage cost for output file.
- TOTAL_COST($): Total network cost for input/output files (IN_COST($) + OUT_COST($))
PICS Validation Results
To validate the correctness of PICS simulation, we compared PICS with a real-world cloud application on Amazon Web Services.
- Baseline Information
- Baseline Public Cloud: Amazon Web Services (focusing on EC2 instances and S3 storage)
- Baseline Cloud Application: Hadoop (MapReduce) Application
- Workload Patterns: Steady, Bursty[1],[2], Poisson-based Random Workload Pattern.
- Job Scheduling: EDF Scheduling
- VM Selection: Cost-based VM Selection
- VM Scaling: Horizontal (Scale-Out/In) and Vertical (Scale-Up/Down) Scaling
- PICS Validation Results
- Cost Traces
- Bursty Workload
- Random(Poisson) Workload
- VM Utilization
- VM Scaling
- Bursty Workload
- Random(Poisson) Workload
- Job Deadline
- Steady Workload
- Bursty Workload
Setup and PICS Execution
- Prerequisite Packages
- Run PICS Simulator
$ wget https://github.com/ik2sb/PICS/archive/master.tar.gz (or master.zip)
$ tar xvfz master.tar.gz (or unzip master.zip)
$ cd PICS-master
# Configure your simulation settings.
# - General Configuration: ./config/config.txt
# - VM Configuration: ./config/vm_config.txt
# - Workload Configuration: ./workload/workload.csv
$ python run_simulation.py
Project Team Members
- In Kee Kim (Computer Science at The University of Georgia)
- Wei Wang (Computer Science at University of Texas at San Antonio)
- Marty Humphrey (Computer Science at University of Virginia)
Publication
- In Kee Kim, Wei Wang, and Marty Humphrey, "PICS: A Public IaaS Cloud Simulator", 8th IEEE International Conference on Cloud Computing (IEEE CLOUD 2015)
Technical Support
If you need any technical support for PICS, please contact In Kee Kim (inkee.kim@uga.edu).