Current advanced cloud infrastructure management solutions allow scheduling actions for dynamically changing the number of running virtual machines (VMs). This approach, however, does not guarantee that the scheduled number of VMs will properly handle the actual user generated workload, especially if the user utilization patterns will change. We propose using a dynamically generated scaling model for the VMs containing the services of the distributed applications, which is able to react to the variations in the number of application users. We answer the following question: How to dynamically decide how many services of each type are needed in order to handle a larger workload within the same time constraints? We describe a mechanism for dynamically composing the SLAs for controlling the scaling of distributed services by combining data analysis mechanisms with application benchmarking using multiple VM configurations. Based on processing of multiple application benchmarks generated data sets we discover a set of service monitoring metrics able to predict critical Service Level Agreement (SLA) parameters. By combining this set of predictor metrics with a heuristic for selecting the appropriate scaling-out paths for the services of distributed applications, we show how SLA scaling rules can be inferred and then used for controlling the runtime scale-in and scale-out of distributed services. We validate our architecture and models by performing scaling experiments with a distributed application representative for the enterprise class of information systems. We show how dynamically generated SLAs can be successfully used for controlling the management of distributed services scaling.