The QWS Dataset

 
 
About QWS Dataset

The main goal of this dataset is to offer a basis for Web service researchers. To this end, we collected 5,000 Web services and performed various measurements on them. We are pleased to provide a subset of 365 real Web service implementations that exist on the Web today. The services were collected using our Web Service Crawler Engine (WSCE). The majority of the Web services were obtained from public sources on the Web, including Universal Description, Discovery, and Integration (UDDI) registries, search engines, and service portals. The public dataset consists of 365 Web services, each with a set of nine Quality of Web Service (QWS) attributes that we measured using commercial benchmark tools. Each service was tested over a ten-minute period for three consecutive days.

 
Description of QWS Dataset
Each row in the dataset corresponds to an existing Web service implementation available on the public Web today. We have randomly collected 365 Web services from our service repository and continuously monitored particular service qualities including:

QWS Parameters and Units

ID | Parameter Name | Description | Units
1 | Response Time | Time taken to send a request and receive a response | ms
2 | Availability | Number of successful invocations / total invocations | %
3 | Throughput | Total number of invocations for a given period of time | invokes/second
4 | Successability | Number of responses / number of request messages | %
5 | Reliability | Ratio of the number of error messages to total messages | %
6 | Compliance | The extent to which a WSDL document follows the WSDL specification | %
7 | Best Practices | The extent to which a Web service follows the WS-I Basic Profile | %
8 | Latency | Time taken for the server to process a given request | ms
9 | Documentation | Measure of documentation (i.e. description tags) in WSDL | %
10 | WsRF | Web Service Relevancy Function: a rank for Web service quality | %
11 | Service Classification | Levels representing service offering qualities (1 through 4) | Classifier
12 | Service Name | Name of the Web service | None
13 | WSDL Address | Location of the Web Services Description Language (WSDL) file on the Web | None

The ID of each parameter gives the position of its column within the QWS dataset, in order.
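
As a rough illustration of this column order, the following Python sketch splits one record into named fields. It assumes the comma-delimited plain-text layout described under Demos. The field identifiers and the sample record are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Illustrative names for the 13 columns, in the ID order of the table above.
# These identifiers are assumptions; the dataset itself has no header row
# that we know of.
QWS_COLUMNS = [
    "response_time_ms", "availability_pct", "throughput",
    "successability_pct", "reliability_pct", "compliance_pct",
    "best_practices_pct", "latency_ms", "documentation_pct",
    "wsrf", "service_class", "service_name", "wsdl_address",
]


def parse_qws_record(line):
    """Split one comma-delimited record into a dict keyed by column name."""
    fields = next(csv.reader(io.StringIO(line)))
    if len(fields) != len(QWS_COLUMNS):
        raise ValueError(f"expected {len(QWS_COLUMNS)} fields, got {len(fields)}")
    record = dict(zip(QWS_COLUMNS, (f.strip() for f in fields)))
    # Columns 1 through 10 are numeric measurements or scores.
    for name in QWS_COLUMNS[:10]:
        record[name] = float(record[name])
    # Column 11 is the classification level (1 through 4).
    record["service_class"] = int(record["service_class"])
    return record


# Made-up record, used only to show the call.
sample = "302.75,89,7.1,90,73,78,80,187.75,32,78,2,ExampleService,http://example.com/service?wsdl"
record = parse_qws_record(sample)
print(record["service_name"], record["response_time_ms"], record["service_class"])
```

Keeping the column names in one list means the mapping can be adjusted in a single place if the actual file differs from this assumed layout.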

 
Definitions
 
 

Documentation

  One of the main properties of a Web service is proper documentation. The Documentation QWS property measures the extent to which a Web service is self-describing. It is based on examining WSDL documents, including the service name and description, operation name and description, and message name and description tags.
 
 

Quality of Web Service (QWS)

  A Web service's ability to provide selective treatment to various clients in the most effective manner.
 
 

Web Service Relevancy Function (WsRF)

  WsRF is used to measure the quality ranking of a Web service based on quality metrics (1 through 9 above).
 
 

Service Classification

  The service classification represents various levels of service offering qualities. There are four service classifications:
  1. Platinum (High quality)
  2. Gold
  3. Silver
  4. Bronze (Low quality)

The classification is based on the overall quality rating provided by our WsRF. Using the WsRF value obtained for each Web service, we apply a classification scheme that associates each Web service with a particular service group. The classification can help differentiate between services that offer the same functionality.
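
The numeric levels can be decoded into the four group names listed above. The Python sketch below does only that: the WsRF cut-offs behind the classification scheme are not given here, so it makes no attempt to reproduce them. The function names are hypothetical.

```python
from collections import Counter

# Mapping taken from the list of service classifications above.
CLASS_LABELS = {1: "Platinum", 2: "Gold", 3: "Silver", 4: "Bronze"}


def label_for(service_class):
    """Return the group name for a classification level (1 through 4)."""
    return CLASS_LABELS[int(service_class)]


def count_by_label(service_classes):
    """Count how many services fall into each group, including empty groups."""
    counts = Counter(label_for(c) for c in service_classes)
    return {label: counts.get(label, 0) for label in CLASS_LABELS.values()}


# Made-up classification values, used only to show the call.
print(count_by_label([1, 4, 4, 2]))
```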

 
Download Instructions
The QWS dataset is available free of charge for educational and non-commercial purposes. In exchange, we kindly request that you make available to us the results of running the QWS dataset. Please use the following references in citing the dataset:
  • Al-Masri, E., and Mahmoud, Q. H., "Discovering the best web service", (poster) 16th International Conference on World Wide Web (WWW), 2007, pp. 1257-1258.
  • Al-Masri, E., and Mahmoud, Q. H., "QoS-based Discovery and Ranking of Web Services", IEEE 16th International Conference on Computer Communications and Networks (ICCCN), 2007, pp. 529-534.

Downloading and using the QWS data indicates your acceptance of a GNU General Public License agreement. If the QWS data is used in any scientific or educational study or research, the authors are to be credited as the source of the data by citing any of the references listed above. Redistribution of this data to any third party or on the Web is not permitted.

To get a copy of the QWS dataset and sample code for the following demos, please send your request via email to ealmasri[AT]uoguelph.ca or qmahmoud[AT]uoguelph.ca.

 
Demos
We have provided sample scripts in Perl to work with our dataset, but you can use any platform or language. The dataset is a plain text file that contains multiple records. Each record consists of thirteen parameters, as described above, separated by commas. You can parse the text file using any language (e.g. ASP, ASP.NET, JSP, C/C++, Java, PHP). The following demos display a small portion of the QWS dataset.

Display Demo: A sample program written in Perl that parses the dataset and displays a portion of the results in an HTML table.

Search Services Demo: A sample program written in Perl that searches through the contents of the dataset to display the corresponding Web service. The results are displayed in descending ranking order based on our WsRF values (or quality rating).
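
The search-and-rank behaviour described for this demo can be approximated in a few lines of Python. This is a sketch under stated assumptions, not a port of the Perl demo: the matching rule (case-insensitive substring match on the service name), the dictionary keys, and the sample records are all made up for illustration.

```python
def search_services(records, keyword):
    """Return records whose name contains keyword, highest WsRF value first."""
    needle = keyword.lower()
    matches = [r for r in records if needle in r["service_name"].lower()]
    return sorted(matches, key=lambda r: r["wsrf"], reverse=True)


# Made-up records; real ones would be parsed from the dataset file.
records = [
    {"service_name": "WeatherA", "wsrf": 62.0},
    {"service_name": "StockQuote", "wsrf": 80.0},
    {"service_name": "WeatherB", "wsrf": 75.0},
]

for service in search_services(records, "weather"):
    print(service["service_name"], service["wsrf"])
```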

 
Applications
  • Artificial Neural Network (ANN): use the Service Classification as input to the network to identify high-quality Web services.
  • Web service status: use the QWS dataset to obtain an overview of the current status of Web services on the Web.

Update (September 2008)
We are pleased to offer researchers the following updated datasets:
  • An updated QWS Dataset that includes a set of 2,507 Web services and their QWS measurements that were conducted in March 2008 using our Web Service Broker (WSB) framework. Each row in this dataset consists of 11 parameters separated by commas for each Web service. The first nine parameters are QWS parameters measured using Web service benchmark tools over a six-day period. The QWS values represent averages of the measurements collected during that period. The last two parameters represent the service name and reference to the WSDL document.

    Here is an example:
    67.5,86,6,86,73,78,80,1.5,95,check,http://ws.cdyne.com/spellchecker/check.asmx?wsdl

  • The QWS-WSDL Dataset, which includes a set of 2,507 Web service interfaces (WSDL documents) that were collected using our WSCE crawler. The dataset includes an index file that contains two parameters for each Web service: a nine-digit number representing the WSDL filename (with extension .wsdl) and a reference to the location of the WSDL document on the Web.

    Here is an example:
    661887839.wsdl,http://www.xignite.com/xDataSet.asmx?wsdl
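
The two example lines above can be parsed as follows. This Python sketch assumes that only the last field of each row (the WSDL reference) may itself contain commas. The function names and dictionary keys are hypothetical.

```python
def parse_updated_row(line):
    """Parse one 11-field row: nine QWS values, service name, WSDL reference."""
    # maxsplit=10 keeps any commas inside the trailing URL intact.
    fields = line.strip().split(",", 10)
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    return {
        "qws": [float(value) for value in fields[:9]],
        "service_name": fields[9],
        "wsdl_url": fields[10],
    }


def parse_index_row(line):
    """Parse one QWS-WSDL index row into (filename, WSDL reference)."""
    filename, wsdl_url = line.strip().split(",", 1)
    return filename, wsdl_url


row = parse_updated_row(
    "67.5,86,6,86,73,78,80,1.5,95,check,http://ws.cdyne.com/spellchecker/check.asmx?wsdl"
)
index_entry = parse_index_row("661887839.wsdl,http://www.xignite.com/xDataSet.asmx?wsdl")
print(row["service_name"], row["qws"][0], index_entry[0])
```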

The following paper has more detailed information about WSCE and the status of Web services on the Web:

Al-Masri, E., and Mahmoud, Q. H., "Investigating Web Services on the World Wide Web", 17th International Conference on World Wide Web (WWW), Beijing, April 2008, pp. 795-804. (Nominated for Best Student Paper Award.)

If you'd like to use the updated datasets, please contact us.

 
Contact Us
Your comments and suggestions are welcome. Please send your comments by email: Eyhab Al-Masri (ealmasri[AT]uoguelph.ca) or Qusay H. Mahmoud (qmahmoud[AT]uoguelph.ca)
 


This page is maintained by Eyhab Al-Masri (ealmasri[AT]uoguelph.ca) and Qusay H. Mahmoud (qmahmoud[AT]uoguelph.ca). Last modified October 2008.