Diffstat (limited to 'bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager')
7 files changed, 171 insertions, 152 deletions
diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/README.md b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/README.md
index 02508817..430cc974 100644
--- a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/README.md
+++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/README.md
@@ -14,142 +14,170 @@ See the License for the specific language governing permissions and
 limitations under the License.
 -->
-## Overview
+# Overview
 
 The Apache Hadoop software library is a framework that allows for the
 distributed processing of large data sets across clusters of computers
 using a simple programming model.
 
-This charm deploys the ResourceManager component of the Apache Bigtop platform
-to provide YARN master resources.
+This charm deploys the ResourceManager component of the [Apache Bigtop][]
+platform to provide YARN master resources.
+
+[Apache Bigtop]: http://bigtop.apache.org/
 
-## Usage
-This charm is intended to be deployed via one of the
-[apache bigtop bundles](https://jujucharms.com/u/bigdata-dev/#bundles).
-For example:
+# Deploying
 
-    juju deploy hadoop-processing
+A working Juju installation is assumed to be present. If Juju is not yet set
+up, please follow the [getting-started][] instructions prior to deploying this
+charm.
 
-> Note: With Juju versions < 2.0, you will need to use [juju-deployer][] to
-deploy the bundle.
+This charm is intended to be deployed via one of the [apache bigtop bundles][].
+For example:
 
-This will deploy the Apache Bigtop platform with a workload node
-preconfigured to work with the cluster.
+    juju deploy hadoop-processing
 
-You can also manually load and run map-reduce jobs via the plugin charm
-included in the bundles linked above:
+> **Note**: The above assumes Juju 2.0 or greater. If using an earlier version
+of Juju, use [juju-quickstart][] with the following syntax: `juju quickstart
+hadoop-processing`.
 
-    juju scp my-job.jar plugin/0:
-    juju ssh plugin/0
-    hadoop jar my-job.jar
+This will deploy an Apache Bigtop cluster with this charm acting as the
+ResourceManager. More information about this deployment can be found in the
+[bundle readme](https://jujucharms.com/hadoop-processing/).
 
+## Network-Restricted Environments
+Charms can be deployed in environments with limited network access. To deploy
+in this environment, configure a Juju model with appropriate proxy and/or
+mirror options. See [Configuring Models][] for more information.
 
-[juju-deployer]: https://pypi.python.org/pypi/juju-deployer/
+[getting-started]: https://jujucharms.com/docs/stable/getting-started
+[apache bigtop bundles]: https://jujucharms.com/u/bigdata-charmers/#bundles
+[juju-quickstart]: https://launchpad.net/juju-quickstart
+[Configuring Models]: https://jujucharms.com/docs/stable/models-config
 
-## Status and Smoke Test
+# Verifying
 
+## Status
 Apache Bigtop charms provide extended status reporting to indicate when they
 are ready:
 
-    juju status --format=tabular
+    juju status
 
 This is particularly useful when combined with `watch` to track the on-going
 progress of the deployment:
 
-    watch -n 0.5 juju status --format=tabular
+    watch -n 2 juju status
 
-The message for each unit will provide information about that unit's state.
-Once they all indicate that they are ready, you can perform a "smoke test"
-to verify HDFS or YARN services are working as expected. Trigger the
-`smoke-test` action by:
+The message column will provide information about a given unit's state.
+This charm is ready for use once the status message indicates that it is
+ready with nodemanagers.
 
-    juju action do namenode/0 smoke-test
-    juju action do resourcemanager/0 smoke-test
+## Smoke Test
+This charm provides a `smoke-test` action that can be used to verify the
+application is functioning as expected. This action executes the 'yarn'
+smoke tests provided by Apache Bigtop and may take up to
+10 minutes to complete. Run the action as follows:
 
-After a few seconds or so, you can check the results of the smoke test:
+    juju run-action resourcemanager/0 smoke-test
 
-    juju action status
+> **Note**: The above assumes Juju 2.0 or greater. If using an earlier version
+of Juju, the syntax is `juju action do resourcemanager/0 smoke-test`.
 
-You will see `status: completed` if the smoke test was successful, or
-`status: failed` if it was not. You can get more information on why it failed
-via:
+Watch the progress of the smoke test actions with:
 
-    juju action fetch <action-id>
+    watch -n 2 juju show-action-status
 
+> **Note**: The above assumes Juju 2.0 or greater. If using an earlier version
+of Juju, the syntax is `juju action status`.
 
-## Benchmarking
+Eventually, the action should settle to `status: completed`. If it
+reports `status: failed`, the application is not working as expected. Get
+more information about a specific smoke test with:
 
-This charm provides several benchmarks to gauge the performance of your
-environment.
+    juju show-action-output <action-id>
 
-The easiest way to run the benchmarks on this service is to relate it to the
-[Benchmark GUI][]. You will likely also want to relate it to the
-[Benchmark Collector][] to have machine-level information collected during the
-benchmark, for a more complete picture of how the machine performed.
+> **Note**: The above assumes Juju 2.0 or greater. If using an earlier version
+of Juju, the syntax is `juju action fetch <action-id>`.
 
-[Benchmark GUI]: https://jujucharms.com/benchmark-gui/
-[Benchmark Collector]: https://jujucharms.com/benchmark-collector/
+## Utilities
+This charm includes Hadoop command line and web utilities that can be used
+to verify information about the cluster.
 
-However, each benchmark is also an action that can be called manually:
+Show the running nodes on the command line with the following:
 
-    $ juju action do resourcemanager/0 nnbench
-    Action queued with id: 55887b40-116c-4020-8b35-1e28a54cc622
-    $ juju action fetch --wait 0 55887b40-116c-4020-8b35-1e28a54cc622
+    juju run --application resourcemanager "su yarn -c 'yarn node -list'"
 
-    results:
-      meta:
-        composite:
-          direction: asc
-          units: secs
-          value: "128"
-        start: 2016-02-04T14:55:39Z
-        stop: 2016-02-04T14:57:47Z
-      results:
-        raw: '{"BAD_ID": "0", "FILE: Number of read operations": "0", "Reduce input groups":
-          "8", "Reduce input records": "95", "Map output bytes": "1823", "Map input records":
-          "12", "Combine input records": "0", "HDFS: Number of bytes read": "18635", "FILE:
-          Number of bytes written": "32999982", "HDFS: Number of write operations": "330",
-          "Combine output records": "0", "Total committed heap usage (bytes)": "3144749056",
-          "Bytes Written": "164", "WRONG_LENGTH": "0", "Failed Shuffles": "0", "FILE:
-          Number of bytes read": "27879457", "WRONG_MAP": "0", "Spilled Records": "190",
-          "Merged Map outputs": "72", "HDFS: Number of large read operations": "0", "Reduce
-          shuffle bytes": "2445", "FILE: Number of large read operations": "0", "Map output
-          materialized bytes": "2445", "IO_ERROR": "0", "CONNECTION": "0", "HDFS: Number
-          of read operations": "567", "Map output records": "95", "Reduce output records":
-          "8", "WRONG_REDUCE": "0", "HDFS: Number of bytes written": "27412", "GC time
-          elapsed (ms)": "603", "Input split bytes": "1610", "Shuffled Maps ": "72", "FILE:
-          Number of write operations": "0", "Bytes Read": "1490"}'
-    status: completed
-    timing:
-      completed: 2016-02-04 14:57:48 +0000 UTC
-      enqueued: 2016-02-04 14:55:14 +0000 UTC
-      started: 2016-02-04 14:55:27 +0000 UTC
+To access the Resource Manager web consoles, find the `PUBLIC-ADDRESS` of the
+resourcemanager application and expose it:
 
+    juju status resourcemanager
+    juju expose resourcemanager
 
-## Deploying in Network-Restricted Environments
+The YARN and Job History web interfaces will be available at the following URLs:
 
-Charms can be deployed in environments with limited network access. To deploy
-in this environment, you will need a local mirror to serve required packages.
+    http://RESOURCEMANAGER_PUBLIC_IP:8088
+    http://RESOURCEMANAGER_PUBLIC_IP:19888
+
+
+# Benchmarking
+
+This charm provides several benchmarks to gauge the performance of the
+cluster. Each benchmark is an action that can be run with `juju run-action`:
 
+    $ juju actions resourcemanager
+    ACTION      DESCRIPTION
+    mrbench     Mapreduce benchmark for small jobs
+    nnbench     Load test the NameNode hardware and configuration
+    smoke-test  Run an Apache Bigtop smoke test.
+    teragen     Generate data with teragen
+    terasort    Runs teragen to generate sample data, and then runs terasort to sort that data
+    testdfsio   DFS IO Testing
 
-### Mirroring Packages
+    $ juju run-action resourcemanager/0 nnbench
+    Action queued with id: 55887b40-116c-4020-8b35-1e28a54cc622
 
-You can setup a local mirror for apt packages using squid-deb-proxy.
-For instructions on configuring juju to use this, see the
-[Juju Proxy Documentation](https://juju.ubuntu.com/docs/howto-proxies.html).
+    $ juju show-action-output 55887b40-116c-4020-8b35-1e28a54cc622
+    results:
+      meta:
+        composite:
+          direction: asc
+          units: secs
+          value: "128"
+        start: 2016-02-04T14:55:39Z
+        stop: 2016-02-04T14:57:47Z
+      results:
+        raw: '{"BAD_ID": "0", "FILE: Number of read operations": "0", "Reduce input groups":
+          "8", "Reduce input records": "95", "Map output bytes": "1823", "Map input records":
+          "12", "Combine input records": "0", "HDFS: Number of bytes read": "18635", "FILE:
+          Number of bytes written": "32999982", "HDFS: Number of write operations": "330",
+          "Combine output records": "0", "Total committed heap usage (bytes)": "3144749056",
+          "Bytes Written": "164", "WRONG_LENGTH": "0", "Failed Shuffles": "0", "FILE:
+          Number of bytes read": "27879457", "WRONG_MAP": "0", "Spilled Records": "190",
+          "Merged Map outputs": "72", "HDFS: Number of large read operations": "0", "Reduce
+          shuffle bytes": "2445", "FILE: Number of large read operations": "0", "Map output
+          materialized bytes": "2445", "IO_ERROR": "0", "CONNECTION": "0", "HDFS: Number
+          of read operations": "567", "Map output records": "95", "Reduce output records":
+          "8", "WRONG_REDUCE": "0", "HDFS: Number of bytes written": "27412", "GC time
+          elapsed (ms)": "603", "Input split bytes": "1610", "Shuffled Maps ": "72", "FILE:
+          Number of write operations": "0", "Bytes Read": "1490"}'
+    status: completed
+    timing:
+      completed: 2016-02-04 14:57:48 +0000 UTC
+      enqueued: 2016-02-04 14:55:14 +0000 UTC
+      started: 2016-02-04 14:55:27 +0000 UTC
 
-## Contact Information
+# Contact Information
 
 - <bigdata@lists.ubuntu.com>
 
-## Hadoop
+# Resources
 
 - [Apache Bigtop](http://bigtop.apache.org/) home page
 - [Apache Bigtop issue tracking](http://bigtop.apache.org/issue-tracking.html)
 - [Apache Bigtop mailing lists](http://bigtop.apache.org/mail-lists.html)
-- [Apache Bigtop charms](https://jujucharms.com/q/apache/bigtop)
+- [Juju Bigtop charms](https://jujucharms.com/q/apache/bigtop)
+- [Juju mailing list](https://lists.ubuntu.com/mailman/listinfo/juju)
+- [Juju community](https://jujucharms.com/community)
diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions.yaml b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions.yaml
index da4fc08e..77a644bf 100644
--- a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions.yaml
+++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions.yaml
@@ -1,6 +1,5 @@
 smoke-test:
-  description: >
-    Verify that YARN is working as expected by running a small (1MB) terasort.
+  description: Run an Apache Bigtop smoke test.
 mrbench:
   description: Mapreduce benchmark for small jobs
   params:
diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/smoke-test b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/smoke-test
index 9ef33a9f..3280e791 100755
--- a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/smoke-test
+++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/actions/smoke-test
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env python3
 
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
@@ -15,66 +15,34 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-set -ex
+import sys
+sys.path.append('lib')
 
-if ! charms.reactive is_state 'apache-bigtop-resourcemanager.ready'; then
-    action-fail 'ResourceManager not yet ready'
-    exit
-fi
+from charmhelpers.core import hookenv
+from charms.layer.apache_bigtop_base import Bigtop
+from charms.reactive import is_state
 
-IN_DIR='/tmp/smoke_test_in'
-OUT_DIR='/tmp/smoke_test_out'
-SIZE=10000
-OPTIONS=''
-MAPS=1
-REDUCES=1
-NUMTASKS=1
-COMPRESSION='LocalDefault'
+def fail(msg, output=None):
+    if output:
+        hookenv.action_set({'output': output})
+    hookenv.action_fail(msg)
+    sys.exit()
 
-OPTIONS="${OPTIONS} -D mapreduce.job.maps=${MAPS}"
-OPTIONS="${OPTIONS} -D mapreduce.job.reduces=${REDUCES}"
-OPTIONS="${OPTIONS} -D mapreduce.job.jvm.numtasks=${NUMTASKS}"
-if [ $COMPRESSION == 'Disable' ] ; then
-    OPTIONS="${OPTIONS} -D mapreduce.map.output.compress=false"
-elif [ $COMPRESSION == 'LocalDefault' ] ; then
-    OPTIONS="${OPTIONS}"
-else
-    OPTIONS="${OPTIONS} -D mapreduce.map.output.compress=true -D mapred.map.output.compress.codec=org.apache.hadoop.io.compress.${COMPRESSION}Codec"
-fi
+if not is_state('apache-bigtop-resourcemanager.ready'):
+    fail('Charm is not yet ready to run the Bigtop smoke test(s)')
 
-# create dir to store results
-RUN=`date +%s`
-RESULT_DIR=/opt/terasort-results
-RESULT_LOG=${RESULT_DIR}/${RUN}.$$.log
-mkdir -p ${RESULT_DIR}
-chown -R hdfs ${RESULT_DIR}
+# Bigtop smoke test components
+smoke_components = ['yarn']
 
-# clean out any previous data (must be run as the hdfs user)
-su hdfs << EOF
-if hadoop fs -stat ${IN_DIR} &> /dev/null; then
-    hadoop fs -rm -r -skipTrash ${IN_DIR} || true
-fi
-if hadoop fs -stat ${OUT_DIR} &> /dev/null; then
-    hadoop fs -rm -r -skipTrash ${OUT_DIR} || true
-fi
-EOF
+# Env required by test components
+smoke_env = {
+    'HADOOP_CONF_DIR': '/etc/hadoop/conf',
+}
 
-START=`date +%s`
-# NB: Escaped vars in the block below (e.g., \${HADOOP_MAPRED_HOME}) come from
-# the environment while non-escaped vars (e.g., ${IN_DIR}) are parameterized
-# from this outer scope
-su hdfs << EOF
-. /etc/default/hadoop
-echo 'generating data'
-hadoop jar \${HADOOP_MAPRED_HOME}/hadoop-mapreduce-examples-*.jar teragen ${SIZE} ${IN_DIR} &>/dev/null
-echo 'sorting data'
-hadoop jar \${HADOOP_MAPRED_HOME}/hadoop-mapreduce-examples-*.jar terasort ${OPTIONS} ${IN_DIR} ${OUT_DIR} &> ${RESULT_LOG}
-EOF
-STOP=`date +%s`
-
-if ! grep -q 'Bytes Written=1000000' ${RESULT_LOG}; then
-    action-fail 'smoke-test failed'
-    action-set log="$(cat ${RESULT_LOG})"
-fi
-DURATION=`expr $STOP - $START`
+bigtop = Bigtop()
+result = bigtop.run_smoke_tests(smoke_components, smoke_env)
+if result == 'success':
+    hookenv.action_set({'outcome': result})
+else:
+    fail('{} smoke tests failed'.format(smoke_components), result)
diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/layer.yaml b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/layer.yaml
index ad0b5695..c2e34205 100644
--- a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/layer.yaml
+++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/layer.yaml
@@ -1,4 +1,4 @@
-repo: git@github.com:juju-solutions/layer-hadoop-resourcemanager.git
+repo: https://github.com/apache/bigtop/tree/master/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager
 includes:
   - 'layer:apache-bigtop-base'
   - 'interface:dfs'
diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/metadata.yaml b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/metadata.yaml
index 82b82cd7..695d5bfa 100644
--- a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/metadata.yaml
+++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/metadata.yaml
@@ -1,12 +1,12 @@
 name: hadoop-resourcemanager
-summary: YARN master (ResourceManager) for Apache Bigtop platform
+summary: YARN master (ResourceManager) from Apache Bigtop
 maintainer: Juju Big Data <bigdata@lists.ubuntu.com>
 description: >
   Hadoop is a software platform that lets one easily write and run applications
  that process vast amounts of data.
-  This charm manages the YARN master node (ResourceManager).
-tags: ["applications", "bigdata", "bigtop", "hadoop", "apache"]
+  This charm provides the YARN master node (ResourceManager).
+tags: []
 provides:
   resourcemanager:
     interface: mapred
diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/reactive/resourcemanager.py b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/reactive/resourcemanager.py
index afca26bd..3f3e9ae7 100644
--- a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/reactive/resourcemanager.py
+++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/reactive/resourcemanager.py
@@ -15,7 +15,9 @@
 # limitations under the License.
 
 from charms.reactive import is_state, remove_state, set_state, when, when_not
-from charms.layer.apache_bigtop_base import Bigtop, get_layer_opts, get_fqdn
+from charms.layer.apache_bigtop_base import (
+    Bigtop, get_hadoop_version, get_layer_opts, get_fqdn
+)
 from charmhelpers.core import hookenv, host
 from jujubigdata import utils
 
@@ -61,11 +63,32 @@ def install_resourcemanager(namenode):
     """
     if namenode.namenodes():
         hookenv.status_set('maintenance', 'installing resourcemanager')
+        # Hosts
         nn_host = namenode.namenodes()[0]
         rm_host = get_fqdn()
+
+        # Ports
+        rm_ipc = get_layer_opts().port('resourcemanager')
+        rm_http = get_layer_opts().port('rm_webapp_http')
+        jh_ipc = get_layer_opts().port('jobhistory')
+        jh_http = get_layer_opts().port('jh_webapp_http')
+
         bigtop = Bigtop()
-        hosts = {'namenode': nn_host, 'resourcemanager': rm_host}
-        bigtop.render_site_yaml(hosts=hosts, roles='resourcemanager')
+        bigtop.render_site_yaml(
+            hosts={
+                'namenode': nn_host,
+                'resourcemanager': rm_host,
+            },
+            roles=[
+                'resourcemanager',
+            ],
+            overrides={
+                'hadoop::common_yarn::hadoop_rm_port': rm_ipc,
+                'hadoop::common_yarn::hadoop_rm_webapp_port': rm_http,
+                'hadoop::common_mapred_app::mapreduce_jobhistory_port': jh_ipc,
+                'hadoop::common_mapred_app::mapreduce_jobhistory_webapp_port': jh_http,
+            }
+        )
         bigtop.trigger_puppet()
 
         # /etc/hosts entries from the KV are not currently used for bigtop,
@@ -104,7 +127,8 @@ def start_resourcemanager(namenode):
         for port in get_layer_opts().exposed_ports('resourcemanager'):
             hookenv.open_port(port)
         set_state('apache-bigtop-resourcemanager.started')
-        hookenv.status_set('active', 'ready')
+        hookenv.application_version_set(get_hadoop_version())
+        hookenv.status_set('maintenance', 'resourcemanager started')
 
 
 ###############################################################################
diff --git a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/tests/01-basic-deployment.py b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/tests/01-basic-deployment.py
index 65dbbbb5..3b694548 100755
--- a/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/tests/01-basic-deployment.py
+++ b/bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/tests/01-basic-deployment.py
@@ -28,7 +28,7 @@ class TestDeploy(unittest.TestCase):
     """
 
     def test_deploy(self):
-        self.d = amulet.Deployment(series='trusty')
+        self.d = amulet.Deployment(series='xenial')
         self.d.add('resourcemanager', 'hadoop-resourcemanager')
         self.d.setup(timeout=900)
         self.d.sentry.wait(timeout=1800)
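The rewritten `smoke-test` action above depends on the charm runtime (`charmhelpers`, `charms.reactive`, `charms.layer.apache_bigtop_base`), so it only runs inside a deployed unit. As a rough standalone sketch of its control flow, the hook environment and the Bigtop test runner can be stubbed out; `FakeHookEnv` and `run_smoke_action` are hypothetical stand-ins, not part of the charm, and the sketch returns instead of calling `sys.exit()` as the real action does:

```python
# Standalone sketch of the new smoke-test action's control flow.
# FakeHookEnv stands in for charmhelpers.core.hookenv, and the
# run_smoke_tests callable stands in for Bigtop.run_smoke_tests;
# both are assumptions made so the logic can run outside a unit.

class FakeHookEnv:
    """Collects action results the way hookenv would report them."""
    def __init__(self):
        self.results = {}
        self.failure = None

    def action_set(self, values):
        self.results.update(values)

    def action_fail(self, message):
        self.failure = message


def run_smoke_action(hookenv, ready, run_smoke_tests):
    """Mirror the action: bail out if the charm is not ready,
    otherwise run the 'yarn' smoke tests and report the outcome."""
    if not ready:
        hookenv.action_fail('Charm is not yet ready to run the Bigtop smoke test(s)')
        return
    result = run_smoke_tests(['yarn'], {'HADOOP_CONF_DIR': '/etc/hadoop/conf'})
    if result == 'success':
        hookenv.action_set({'outcome': result})
    else:
        hookenv.action_set({'output': result})
        hookenv.action_fail("['yarn'] smoke tests failed")


# Exercise both paths with stubbed test runners.
ok = FakeHookEnv()
run_smoke_action(ok, ready=True, run_smoke_tests=lambda comps, env: 'success')

bad = FakeHookEnv()
run_smoke_action(bad, ready=True, run_smoke_tests=lambda comps, env: 'yarn tests exited 1')
```

The design point the commit makes is visible here: instead of driving `teragen`/`terasort` by hand as the old bash script did, the action delegates to Bigtop's own smoke-test suite and only translates its result into Juju action status.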