<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
BigPetStore Data Generator
==========================

BigPetStore ...

Data Generator ...

Building and Testing
--------------------
The BPS data generator uses the Gradle build system, so you'll need
to install Gradle on your system.
Once that's done, you can use Gradle to run the included unit tests
and build the data generator jar.

To build:
    
    $ gradle build

This will create several directories and a jar located at:
    
    build/libs/bigpetstore-data-generator-0.9.0-SNAPSHOT.jar
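
The version in the jar name will change between releases. As a minimal sketch, assuming the default `build/libs` output directory shown above, a small shell function can locate the jar without hardcoding the version. The function name `find_generator_jar` is hypothetical and not part of the project.

```shell
#!/bin/sh
# Hypothetical helper (not part of the project): print the path of the first
# data generator jar found in the given directory (default: build/libs),
# or return a non-zero status if none exists.
find_generator_jar() {
    libs_dir="${1:-build/libs}"
    for jar in "$libs_dir"/bigpetstore-data-generator-*.jar; do
        # If the glob matched nothing, the loop sees the literal pattern,
        # which is not a regular file and is therefore skipped.
        if [ -f "$jar" ]; then
            printf '%s\n' "$jar"
            return 0
        fi
    done
    echo "no data generator jar found in $libs_dir; run 'gradle build' first" >&2
    return 1
}
```

After a successful build, `JAR=$(find_generator_jar)` stores the jar path for use in later commands.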

Building automatically runs the included unit tests.  To run only
the unit tests, use:

    $ gradle test


To clean up the build files, run:

    $ gradle clean


Running the Data Generator
--------------------------
The data generator can be used as a library (for incorporation into
Hadoop or Spark applications) or through a command-line interface (CLI).
The CLI requires several parameters.  To print their descriptions,
run the jar without any arguments:

    $ java -jar build/libs/bigpetstore-data-generator-0.9.0-SNAPSHOT.jar

Here is an example for generating 10 stores, 1000 customers, 100 purchasing models,
and a year of transactions:

    $ java -jar build/libs/bigpetstore-data-generator-0.9.0-SNAPSHOT.jar generatedData/ 10 1000 100 365.0
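
Positional arguments are easy to mix up, so it can help to name them. The sketch below rebuilds the example command from named variables. The variable names are hypothetical, and the argument order and the reading of `365.0` as a number of days are inferred from the example above; the descriptions printed by the jar itself are the authoritative reference.

```shell
#!/bin/sh
# Values copied from the example above. The variable names are illustrative,
# and the argument order is inferred from that example rather than from the
# CLI's own parameter descriptions.
JAR=build/libs/bigpetstore-data-generator-0.9.0-SNAPSHOT.jar
OUTPUT_DIR=generatedData/
N_STORES=10
N_CUSTOMERS=1000
N_PURCHASING_MODELS=100
SIMULATION_DAYS=365.0   # assumed to be a length in days ("a year")

# Print the full command line so it can be reviewed before running it.
generator_command() {
    printf 'java -jar %s %s %s %s %s %s\n' \
        "$JAR" "$OUTPUT_DIR" "$N_STORES" "$N_CUSTOMERS" \
        "$N_PURCHASING_MODELS" "$SIMULATION_DAYS"
}

generator_command
```

Printing the command first makes it easy to check the values; once they look right, the printed line can be pasted into a shell to run the generator.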


Groovy Drivers for Scripting
----------------------------
Several Groovy example script drivers are included in the `groovy_example_drivers` directory.
Groovy scripts can be used to easily call and interact with classes in the data generator
jar without having to create separate Java projects or worry about compilation.  I've found
them to be very useful for interactive exploration and validating my implementations
when unit tests alone aren't sufficient.

To use the Groovy scripts, you will need to have Groovy installed on your system.  Build the
data generator as instructed above.  Then run a script from within the `groovy_example_drivers`
directory like so:

    $ groovy -classpath ../build/libs/bigpetstore-data-generator-0.9.0-SNAPSHOT.jar MonteCarloExponentialSamplingExample.groovy
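
To try several drivers in one go, the command above can be generated for every script in the directory. This is only a sketch: the function name `groovy_example_commands` is hypothetical, it assumes paths without spaces, and it only prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper (not part of the project): print one groovy command per
# *.groovy script found in the given directory (default: current directory).
# The second argument is the jar to put on the classpath; the default is the
# path used in the example above.
groovy_example_commands() {
    drivers_dir="${1:-.}"
    jar="${2:-../build/libs/bigpetstore-data-generator-0.9.0-SNAPSHOT.jar}"
    for script in "$drivers_dir"/*.groovy; do
        # Skip the literal pattern when no script matches the glob.
        [ -f "$script" ] || continue
        printf 'groovy -classpath %s %s\n' "$jar" "$script"
    done
}
```

Run from inside `groovy_example_drivers`, `groovy_example_commands` lists the commands for review, and `groovy_example_commands | sh` executes them one after another.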