author | ganeshraju <ganeshraju@gmail.com> | 2016-06-21 16:10:43 -0500
---|---|---
committer | ganeshraju <ganeshraju@gmail.com> | 2016-06-21 16:10:43 -0500
commit | 9d62f51ec3684c75d0fa4b8d2da9301e0a18f661 (patch) |
tree | 2a255c6f11a6358f76e701b1357ee8a758d18998 /bigtop-deploy |
parent | cff413decf877c03c670de67e002bf1d6d15a5fc (diff) |
sync with bigtop master
Diffstat (limited to 'bigtop-deploy')
44 files changed, 2947 insertions, 204 deletions
diff --git a/bigtop-deploy/juju/hadoop-processing/.gitignore b/bigtop-deploy/juju/hadoop-processing/.gitignore new file mode 100644 index 00000000..a295864e --- /dev/null +++ b/bigtop-deploy/juju/hadoop-processing/.gitignore @@ -0,0 +1,2 @@ +*.pyc +__pycache__ diff --git a/bigtop-deploy/juju/hadoop-processing/README.md b/bigtop-deploy/juju/hadoop-processing/README.md new file mode 100644 index 00000000..69425771 --- /dev/null +++ b/bigtop-deploy/juju/hadoop-processing/README.md @@ -0,0 +1,194 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--> +## Overview + +The Apache Hadoop software library is a framework that allows for the +distributed processing of large data sets across clusters of computers +using a simple programming model. + +It is designed to scale up from single servers to thousands of machines, +each offering local computation and storage. Rather than rely on hardware +to deliver high-availability, the library itself is designed to detect +and handle failures at the application layer, so delivering a +highly-available service on top of a cluster of computers, each of +which may be prone to failures. + +This bundle provides a complete deployment of the core components of the +[Apache Bigtop](http://bigtop.apache.org/) +platform to perform distributed data analytics at scale. These components +include: + + * NameNode (HDFS) + * ResourceManager (YARN) + * Slaves (DataNode and NodeManager) + * Client (Bigtop hadoop client) + * Plugin (subordinate cluster facilitator) + +Deploying this bundle gives you a fully configured and connected Apache Bigtop +cluster on any supported cloud, which can be easily scaled to meet workload +demands. + + +## Deploying this bundle + +In this deployment, the aforementioned components are deployed on separate +units. To deploy this bundle, simply use: + + juju deploy hadoop-processing + +This will deploy the bundle and all of its charms from the [charm store][]. + +> Note: With Juju versions < 2.0, you will need to use [juju-deployer][] to +deploy the bundle. + +You can also build all of the charms from their source layers in the +[Bigtop repository][]. See the [charm package README][] for instructions +to build and deploy the charms. + +The default bundle deploys three slave nodes and one node of each of +the other services. To scale the cluster, use: + + juju add-unit slave -n 2 + +This will add two additional slave nodes, for a total of five.
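To verify that the new slaves have registered with HDFS, one option (a sketch only; it assumes a Juju 1.x-style CLI and the default `hdfs` superuser on the namenode unit, neither of which the README above states) is to pull a DataNode report from the namenode:

    juju run --unit namenode/0 'su hdfs -c "hdfs dfsadmin -report"'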
+ +[charm store]: https://jujucharms.com/ +[Bigtop repository]: https://github.com/apache/bigtop +[charm package README]: ../../../bigtop-packages/src/charm/README.md +[juju-deployer]: https://pypi.python.org/pypi/juju-deployer/ + + +## Status and Smoke Test + +The services provide extended status reporting to indicate when they are ready: + + juju status --format=tabular + +This is particularly useful when combined with `watch` to track the ongoing +progress of the deployment: + + watch -n 0.5 juju status --format=tabular + +The charms for each master component (namenode, resourcemanager) +each provide a `smoke-test` action that can be used to verify that each +component is functioning as expected. You can run them all and then watch the +action status list: + + juju action do namenode/0 smoke-test + juju action do resourcemanager/0 smoke-test + watch -n 0.5 juju action status + +Eventually, all of the actions should settle to `status: completed`. If +any go instead to `status: failed`, that component is not working +as expected. You can get more information about that component's smoke test: + + juju action fetch <action-id> + + +## Monitoring + +This bundle includes Ganglia for system-level monitoring of the namenode, +resourcemanager, and slave units. Metrics are sent to a central +ganglia unit for easy viewing in a browser. To view the ganglia web interface, +first expose the service: + + juju expose ganglia + +Now find the ganglia public IP address: + + juju status ganglia + +The ganglia web interface will be available at: + + http://GANGLIA_PUBLIC_IP/ganglia + + +## Benchmarking + +This bundle provides several benchmarks to gauge the performance of your +environment. + +The easiest way to run the benchmarks on this service is to relate it to the +[Benchmark GUI][]. You will likely also want to relate it to the +[Benchmark Collector][] to have machine-level information collected during the +benchmark, for a more complete picture of how the machine performed.
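One plausible way to wire those up (the charm names follow the links below; the exact relation endpoints are an assumption here, so consult each charm's documentation):

    juju deploy benchmark-gui
    juju deploy benchmark-collector
    juju add-relation resourcemanager benchmark-gui
    juju add-relation resourcemanager benchmark-collector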
+ +[Benchmark GUI]: https://jujucharms.com/benchmark-gui/ +[Benchmark Collector]: https://jujucharms.com/benchmark-collector/ + +However, each benchmark is also an action that can be called manually: + + $ juju action do resourcemanager/0 nnbench + Action queued with id: 55887b40-116c-4020-8b35-1e28a54cc622 + $ juju action fetch --wait 0 55887b40-116c-4020-8b35-1e28a54cc622 + + results: + meta: + composite: + direction: asc + units: secs + value: "128" + start: 2016-02-04T14:55:39Z + stop: 2016-02-04T14:57:47Z + results: + raw: '{"BAD_ID": "0", "FILE: Number of read operations": "0", "Reduce input groups": + "8", "Reduce input records": "95", "Map output bytes": "1823", "Map input records": + "12", "Combine input records": "0", "HDFS: Number of bytes read": "18635", "FILE: + Number of bytes written": "32999982", "HDFS: Number of write operations": "330", + "Combine output records": "0", "Total committed heap usage (bytes)": "3144749056", + "Bytes Written": "164", "WRONG_LENGTH": "0", "Failed Shuffles": "0", "FILE: + Number of bytes read": "27879457", "WRONG_MAP": "0", "Spilled Records": "190", + "Merged Map outputs": "72", "HDFS: Number of large read operations": "0", "Reduce + shuffle bytes": "2445", "FILE: Number of large read operations": "0", "Map output + materialized bytes": "2445", "IO_ERROR": "0", "CONNECTION": "0", "HDFS: Number + of read operations": "567", "Map output records": "95", "Reduce output records": + "8", "WRONG_REDUCE": "0", "HDFS: Number of bytes written": "27412", "GC time + elapsed (ms)": "603", "Input split bytes": "1610", "Shuffled Maps ": "72", "FILE: + Number of write operations": "0", "Bytes Read": "1490"}' + status: completed + timing: + completed: 2016-02-04 14:57:48 +0000 UTC + enqueued: 2016-02-04 14:55:14 +0000 UTC + started: 2016-02-04 14:55:27 +0000 UTC + + +## Deploying in Network-Restricted Environments + +Charms can be deployed in environments with limited network access. To deploy +in such an environment, you will need a local mirror to serve required packages. + + +### Mirroring Packages + +You can set up a local mirror for apt packages using squid-deb-proxy. +For instructions on configuring Juju to use this, see the +[Juju Proxy Documentation](https://juju.ubuntu.com/docs/howto-proxies.html).
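As a sketch of that setup with Juju 1.x (the mirror address is a placeholder), the proxy can be declared in the environment configuration:

    # excerpt from ~/.juju/environments.yaml; 10.0.0.2:8000 is a
    # hypothetical squid-deb-proxy host on your network
    environments:
      my-env:
        apt-http-proxy: http://10.0.0.2:8000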
+ + +## Contact Information + +- <bigdata@lists.ubuntu.com> + + +## Resources + +- [Apache Bigtop](http://bigtop.apache.org/) home page +- [Apache Bigtop issue tracking](http://bigtop.apache.org/issue-tracking.html) +- [Apache Bigtop mailing lists](http://bigtop.apache.org/mail-lists.html) +- [Juju Bigtop charms](https://jujucharms.com/q/apache/bigtop) +- [Juju mailing list](https://lists.ubuntu.com/mailman/listinfo/juju) +- [Juju community](https://jujucharms.com/community) diff --git a/bigtop-deploy/juju/hadoop-processing/bundle-dev.yaml b/bigtop-deploy/juju/hadoop-processing/bundle-dev.yaml new file mode 100644 index 00000000..abc18513 --- /dev/null +++ b/bigtop-deploy/juju/hadoop-processing/bundle-dev.yaml @@ -0,0 +1,68 @@ +services: + openjdk: + charm: cs:trusty/openjdk + annotations: + gui-x: "500" + gui-y: "400" + options: + java-type: "jdk" + java-major: "8" + namenode: + charm: cs:~bigdata-dev/trusty/hadoop-namenode + num_units: 1 + annotations: + gui-x: "500" + gui-y: "800" + constraints: mem=7G + resourcemanager: + charm: cs:~bigdata-dev/trusty/hadoop-resourcemanager + num_units: 1 + annotations: + gui-x: "500" + gui-y: "0" + constraints: mem=7G + slave: + charm: cs:~bigdata-dev/trusty/hadoop-slave + num_units: 3 + annotations: + gui-x: "0" + gui-y: "400" + constraints: mem=7G + plugin: + charm: cs:~bigdata-dev/trusty/hadoop-plugin + annotations: + gui-x: "1000" + gui-y: "400" + client: + charm: cs:trusty/hadoop-client + num_units: 1 + annotations: + gui-x: "1250" + gui-y: "400" + ganglia-node: + charm: cs:trusty/ganglia-node + annotations: + gui-x: "250" + gui-y: "400" + ganglia: + charm: cs:trusty/ganglia + num_units: 1 + annotations: + gui-x: "750" + gui-y: "400" +series: trusty +relations: + - [openjdk, namenode] + - [openjdk, resourcemanager] + - [openjdk, slave] + - [openjdk, client] + - [resourcemanager, namenode] + - [namenode, slave] + - [resourcemanager, slave] + - [plugin, namenode] + - [plugin, resourcemanager] + - [client, plugin] + - ["ganglia:node", ganglia-node] + - [ganglia-node, namenode] + - [ganglia-node, resourcemanager] + - [ganglia-node, slave] diff --git a/bigtop-deploy/juju/hadoop-processing/bundle-local.yaml b/bigtop-deploy/juju/hadoop-processing/bundle-local.yaml new file mode 100644 index 00000000..3947f829 --- /dev/null +++ b/bigtop-deploy/juju/hadoop-processing/bundle-local.yaml @@ -0,0 +1,68 @@ +services: + openjdk: + charm: cs:trusty/openjdk + annotations: + gui-x: "500" + gui-y: "400" + options: + java-type: "jdk" + java-major: "8" + namenode: + charm: local:trusty/hadoop-namenode + num_units: 1 + annotations: + gui-x: "500" + gui-y: "800" + constraints: mem=7G + resourcemanager: + charm: local:trusty/hadoop-resourcemanager + num_units: 1 + annotations: + gui-x: "500" + gui-y: "0" + constraints: mem=7G + slave: + charm: local:trusty/hadoop-slave + num_units: 3 + annotations: + gui-x: "0" + gui-y: "400" + constraints: mem=7G + plugin: + charm: local:trusty/hadoop-plugin + annotations: + gui-x: "1000" + gui-y: "400" + client: + charm: cs:trusty/hadoop-client + num_units: 1 + annotations: + gui-x: "1250" + gui-y: "400" + ganglia-node: + charm: cs:trusty/ganglia-node + annotations: + gui-x: "250" + gui-y: "400" + ganglia: + charm: cs:trusty/ganglia + num_units: 1 + annotations: + gui-x: "750" + gui-y: "400" +series: trusty +relations: + - [openjdk, namenode] + - [openjdk, resourcemanager] + - [openjdk, slave] + - [openjdk, client] + - [resourcemanager, namenode] + - [namenode, slave] + - [resourcemanager, slave] + - [plugin, namenode] + - [plugin, 
resourcemanager] + - [client, plugin] + - ["ganglia:node", ganglia-node] + - [ganglia-node, namenode] + - [ganglia-node, resourcemanager] + - [ganglia-node, slave] diff --git a/bigtop-deploy/juju/hadoop-processing/bundle.yaml b/bigtop-deploy/juju/hadoop-processing/bundle.yaml new file mode 100644 index 00000000..dcc5bd99 --- /dev/null +++ b/bigtop-deploy/juju/hadoop-processing/bundle.yaml @@ -0,0 +1,68 @@ +services: + openjdk: + charm: cs:trusty/openjdk-1 + annotations: + gui-x: "500" + gui-y: "400" + options: + java-type: "jdk" + java-major: "8" + namenode: + charm: cs:trusty/hadoop-namenode-3 + num_units: 1 + annotations: + gui-x: "500" + gui-y: "800" + constraints: mem=7G + resourcemanager: + charm: cs:trusty/hadoop-resourcemanager-3 + num_units: 1 + annotations: + gui-x: "500" + gui-y: "0" + constraints: mem=7G + slave: + charm: cs:trusty/hadoop-slave-4 + num_units: 3 + annotations: + gui-x: "0" + gui-y: "400" + constraints: mem=7G + plugin: + charm: cs:trusty/hadoop-plugin-3 + annotations: + gui-x: "1000" + gui-y: "400" + client: + charm: cs:trusty/hadoop-client-4 + num_units: 1 + annotations: + gui-x: "1250" + gui-y: "400" + ganglia-node: + charm: cs:trusty/ganglia-node-2 + annotations: + gui-x: "250" + gui-y: "400" + ganglia: + charm: cs:trusty/ganglia-2 + num_units: 1 + annotations: + gui-x: "750" + gui-y: "400" +series: trusty +relations: + - [openjdk, namenode] + - [openjdk, resourcemanager] + - [openjdk, slave] + - [openjdk, client] + - [resourcemanager, namenode] + - [namenode, slave] + - [resourcemanager, slave] + - [plugin, namenode] + - [plugin, resourcemanager] + - [client, plugin] + - ["ganglia:node", ganglia-node] + - [ganglia-node, namenode] + - [ganglia-node, resourcemanager] + - [ganglia-node, slave] diff --git a/bigtop-deploy/juju/hadoop-processing/copyright b/bigtop-deploy/juju/hadoop-processing/copyright new file mode 100644 index 00000000..e900b97c --- /dev/null +++ b/bigtop-deploy/juju/hadoop-processing/copyright @@ -0,0 +1,16 @@ +Format: http://dep.debian.net/deps/dep5/ + +Files: * +Copyright: Copyright 2015, Canonical Ltd., All Rights Reserved. +License: Apache License 2.0 + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + . + http://www.apache.org/licenses/LICENSE-2.0 + . + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. diff --git a/bigtop-deploy/juju/hadoop-processing/tests/01-bundle.py b/bigtop-deploy/juju/hadoop-processing/tests/01-bundle.py new file mode 100755 index 00000000..176ff748 --- /dev/null +++ b/bigtop-deploy/juju/hadoop-processing/tests/01-bundle.py @@ -0,0 +1,96 @@ +#!/usr/bin/env python3 + +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. 
You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +import unittest + +import yaml +import amulet + + +class TestBundle(unittest.TestCase): + bundle_file = os.path.join(os.path.dirname(__file__), '..', 'bundle.yaml') + + @classmethod + def setUpClass(cls): + # classmethod inheritance doesn't work quite right with + # setUpClass / tearDownClass, so subclasses have to manually call this + cls.d = amulet.Deployment(series='trusty') + with open(cls.bundle_file) as f: + bun = f.read() + bundle = yaml.safe_load(bun) + cls.d.load(bundle) + cls.d.setup(timeout=3600) + cls.d.sentry.wait_for_messages({'client': 'Ready'}, timeout=3600) + cls.hdfs = cls.d.sentry['namenode'][0] + cls.yarn = cls.d.sentry['resourcemanager'][0] + cls.slave = cls.d.sentry['slave'][0] + cls.client = cls.d.sentry['client'][0] + + def test_components(self): + """ + Confirm that all of the required components are up and running. + """ + hdfs, retcode = self.hdfs.run("pgrep -a java") + yarn, retcode = self.yarn.run("pgrep -a java") + slave, retcode = self.slave.run("pgrep -a java") + client, retcode = self.client.run("pgrep -a java") + + assert 'NameNode' in hdfs, "NameNode not started" + assert 'NameNode' not in yarn, "NameNode should not be running on resourcemanager" + assert 'NameNode' not in slave, "NameNode should not be running on slave" + + assert 'ResourceManager' in yarn, "ResourceManager not started" + assert 'ResourceManager' not in hdfs, "ResourceManager should not be running on namenode" + assert 'ResourceManager' not in slave, "ResourceManager should not be running on slave" + + assert 'JobHistoryServer' in yarn, "JobHistoryServer not started" + assert 'JobHistoryServer' not in hdfs, "JobHistoryServer should not be running on namenode" + assert 'JobHistoryServer' not in slave, "JobHistoryServer should not be running on slave" + + assert 'NodeManager' in slave, "NodeManager not started" + assert 'NodeManager' not in yarn, "NodeManager should not be running on resourcemanager" + assert 'NodeManager' not in hdfs, "NodeManager should not be running on namenode" + + assert 'DataNode' in slave, "DataNode not started" + assert 'DataNode' not in yarn, "DataNode should not be running on resourcemanager" + assert 'DataNode' not in hdfs, "DataNode should not be running on namenode" + + def test_hdfs(self): + """Smoke test validates mkdir, ls, chmod, and rm on the hdfs cluster.""" + unit_name = self.hdfs.info['unit_name'] + uuid = self.d.action_do(unit_name, 'smoke-test') + result = self.d.action_fetch(uuid) + # hdfs smoke-test sets outcome=success on success + if result['outcome'] != "success": + error = "HDFS smoke-test failed" + amulet.raise_status(amulet.FAIL, msg=error) + + def test_yarn(self): + """Smoke test validates teragen/terasort.""" + unit_name = self.yarn.info['unit_name'] + uuid = self.d.action_do(unit_name, 'smoke-test') + result = self.d.action_fetch(uuid) + # yarn smoke-test only returns results on failure; if result is not + # empty, the test has failed and has a 'log' key + if result: + error = "YARN smoke-test failed: %s" % result['log'] + amulet.raise_status(amulet.FAIL, msg=error) + + +if __name__ == '__main__': + unittest.main()
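As a usage note for the test above: with a bootstrapped Juju environment and the packages listed in the tests.yaml below installed, the suite can be invoked directly through unittest, for example:

    cd bigtop-deploy/juju/hadoop-processing
    ./tests/01-bundle.py -v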
diff --git a/bigtop-deploy/juju/hadoop-processing/tests/tests.yaml b/bigtop-deploy/juju/hadoop-processing/tests/tests.yaml new file mode 100644 index 00000000..8a4cf6f1 --- /dev/null +++ b/bigtop-deploy/juju/hadoop-processing/tests/tests.yaml @@ -0,0 +1,4 @@ +reset: false +packages: + - amulet + - python3-yaml diff --git a/bigtop-deploy/puppet/README.md b/bigtop-deploy/puppet/README.md index 6364b9e8..692c3952 100644 --- a/bigtop-deploy/puppet/README.md +++ b/bigtop-deploy/puppet/README.md @@ -137,7 +137,7 @@ hadoop_cluster_node::cluster_components: - yarn - zookeeper bigtop::jdk_package_name: "openjdk-7-jre-headless" -bigtop::bigtop_repo_uri: "http://bigtop.s3.amazonaws.com/releases/1.0.0/ubuntu/trusty/x86_64" +bigtop::bigtop_repo_uri: "http://bigtop-repos.s3.amazonaws.com/releases/1.1.0/ubuntu/trusty/x86_64" ``` And finally execute diff --git a/bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml b/bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml index de98502c..5c2c5f12 100644 --- a/bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml +++ b/bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml @@ -16,8 +16,8 @@ # "$components" list. If $components isn't set then everything in the stack will # be installed as usual. Otherwise only a specified list will be set # Possible elements: -# hadoop,yarn,hbase,tachyon,flume,solrcloud,spark,oozie,hcat,sqoop,sqoop2,httpfs, -# hue,mahout,giraph,crunch,pig,hive,zookeeper, ycsb +# hadoop,yarn,hbase,alluxio,flink,flume,solrcloud,spark,oozie,hcat,sqoop,sqoop2,httpfs, +# hue,mahout,giraph,crunch,pig,hive,zookeeper,ycsb,qfs # Example (to deploy only HDFS and YARN server and gateway parts) # This can be a comma-separated list or an array. #hadoop_cluster_node::cluster_components: @@ -54,6 +54,13 @@ #bigtop::bigtop_repo_uri: "http://mirror.example.com/path/to/mirror/" +# Use a pre-installed java environment. The default value of 'false' will cause +# the configured 'bigtop::jdk_package_name' package to be installed. Setting +# this to 'true' will ignore the configured 'bigtop::jdk_package_name' but +# requires a compatible java environment be available prior to Bigtop +# installation. +#bigtop::jdk_preinstalled: false + # Test-only variable controls if user hdfs' sshkeys should be installed to allow # for passwordless login across the cluster.
Required by some integration tests #hadoop::common_hdfs::testonly_hdfs_sshkeys: "no" @@ -98,6 +105,10 @@ hadoop::common_yarn::hadoop_rm_port: "8032" hadoop::common_mapred_app::jobtracker_host: "%{hiera('bigtop::hadoop_head_node')}" hadoop::common_mapred_app::mapreduce_jobhistory_host: "%{hiera('bigtop::hadoop_head_node')}" +# actually default but needed for hadoop::common_yarn::yarn_log_server_url here +bigtop::hadoop_history_server_port: "19888" +bigtop::hadoop_history_server_url: "http://%{hiera('hadoop::common_mapred_app::mapreduce_jobhistory_host')}:%{hiera('bigtop::hadoop_history_server_port')}" +hadoop::common_yarn::yarn_log_server_url: "%{hiera('bigtop::hadoop_history_server_url')}/jobhistory/logs" # actually default but needed for hue::server::webhdfs_url here hadoop::httpfs::hadoop_httpfs_port: "14000" @@ -125,7 +136,14 @@ hcatalog::webhcat::server::kerberos_realm: "%{hiera('kerberos::site::realm')}" spark::common::master_host: "%{hiera('bigtop::hadoop_head_node')}" -tachyon::common::master_host: "%{hiera('bigtop::hadoop_head_node')}" +alluxio::common::master_host: "%{hiera('bigtop::hadoop_head_node')}" + +# qfs +qfs::common::metaserver_host: "%{hiera('bigtop::hadoop_head_node')}" +qfs::common::metaserver_port: "30000" +qfs::common::chunkserver_port: "30000" +qfs::common::metaserver_client_port: "20000" +qfs::common::chunkserver_client_port: "22000" hadoop_zookeeper::server::myid: "0" hadoop_zookeeper::server::ensemble: @@ -135,7 +153,6 @@ hadoop_zookeeper::server::kerberos_realm: "%{hiera('kerberos::site::realm')}" # those are only here because they were present as extlookup keys previously bigtop::hadoop_rm_http_port: "8088" bigtop::hadoop_rm_proxy_port: "8088" -bigtop::hadoop_history_server_port: "19888" bigtop::sqoop2_server_port: "12000" bigtop::hbase_thrift_port: "9090" bigtop::hadoop_oozie_port: "11000" @@ -143,8 +160,8 @@ bigtop::hadoop_oozie_port: "11000" hue::server::rm_host: "%{hiera('hadoop::common_yarn::hadoop_rm_host')}" hue::server::rm_port: "%{hiera('hadoop::common_yarn::hadoop_rm_port')}" hue::server::rm_url: "http://%{hiera('bigtop::hadoop_head_node')}:%{hiera('bigtop::hadoop_rm_http_port')}" -hue::server::rm_proxy_url: "http://%{hiera('bigtop::hadoop_head_node')}:%{hiera('bigtop::hadoop_rm_proxy_port')}" -hue::server::history_server_url: "http://%{hiera('bigtop::hadoop_head_node')}:%{hiera('bigtop::hadoop_history_server_port')}" +hue::server::rm_proxy_url: "http://%{hiera('hadoop::common_yarn::hadoop_ps_host')}:%{hiera('hadoop::common_yarn::hadoop_ps_port')}" +hue::server::history_server_url: "%{hiera('bigtop::hadoop_history_server_url')}" # those use fqdn instead of hadoop_head_node because it's only ever activated # on the gatewaynode hue::server::webhdfs_url: "http://%{fqdn}:%{hiera('hadoop::httpfs::hadoop_httpfs_port')}/webhdfs/v1" @@ -174,3 +191,10 @@ zeppelin::server::spark_master_url: "yarn-client" zeppelin::server::hiveserver2_url: "jdbc:hive2://%{hiera('hadoop-hive::common::hiveserver2_host')}:%{hiera('hadoop-hive::common::hiveserver2_port')}" zeppelin::server::hiveserver2_user: "%{hiera('bigtop::hiveserver2_user')}" zeppelin::server::hiveserver2_password: "%{hiera('bigtop::hiveserver2_password')}" + +# Flink +flink::common::jobmanager_host: "%{hiera('bigtop::hadoop_head_node')}" +flink::common::jobmanager_port: "6123" + +flink::common::ui_port: "8081" +flink::common::storage_dirs: "%{hiera('hadoop::hadoop_storage_dirs')}" diff --git a/bigtop-deploy/puppet/hieradata/site.yaml b/bigtop-deploy/puppet/hieradata/site.yaml index e44cb500..b9ed6f98 100644 --- 
a/bigtop-deploy/puppet/hieradata/site.yaml +++ b/bigtop-deploy/puppet/hieradata/site.yaml @@ -12,7 +12,10 @@ hadoop::hadoop_storage_dirs: - /data/4 #hadoop_cluster_node::cluster_components: +# - alluxio +# - apex # - crunch +# - flink # - flume # - giraph # - ignite_hadoop @@ -25,11 +28,11 @@ hadoop::hadoop_storage_dirs: # - mapred-app # - oozie # - pig +# - qfs # - solrcloud # - spark # - sqoop # - sqoop2 -# - tachyon # - tez # - yarn # - zookeeper diff --git a/bigtop-deploy/puppet/manifests/cluster.pp b/bigtop-deploy/puppet/manifests/cluster.pp index a0be5678..9ff424cd 100644 --- a/bigtop-deploy/puppet/manifests/cluster.pp +++ b/bigtop-deploy/puppet/manifests/cluster.pp @@ -14,6 +14,9 @@ # limitations under the License. $roles_map = { + apex => { + client => ["apex-client"], + }, hdfs-non-ha => { master => ["namenode"], worker => ["datanode"], @@ -49,9 +52,13 @@ $roles_map = { master => ["spark-master"], worker => ["spark-worker"], }, - tachyon => { - master => ["tachyon-master"], - worker => ["tachyon-worker"], + alluxio => { + master => ["alluxio-master"], + worker => ["alluxio-worker"], + }, + flink => { + master => ["flink-jobmanager"], + worker => ["flink-taskmanager"], }, flume => { worker => ["flume-agent"], @@ -105,6 +112,11 @@ $roles_map = { zeppelin => { master => ["zeppelin-server"], }, + qfs => { + master => ["qfs-metaserver"], + worker => ["qfs-chunkserver"], + client => ["qfs-client"], + }, } class hadoop_cluster_node ( @@ -150,7 +162,10 @@ class node_with_roles ($roles = hiera("bigtop::roles")) inherits hadoop_cluster_ } $modules = [ + "alluxio", + "apex", "crunch", + "flink", "giraph", "hadoop", "hadoop_hbase", @@ -166,7 +181,7 @@ class node_with_roles ($roles = hiera("bigtop::roles")) inherits hadoop_cluster_ "mahout", "solr", "spark", - "tachyon", + "qfs", "tez", "ycsb", "kerberos", diff --git a/bigtop-deploy/puppet/manifests/site.pp b/bigtop-deploy/puppet/manifests/site.pp index 6e5fd06b..4862b223 100644 --- a/bigtop-deploy/puppet/manifests/site.pp +++ b/bigtop-deploy/puppet/manifests/site.pp @@ -28,6 +28,7 @@ case $operatingsystem { } } +$jdk_preinstalled = hiera("bigtop::jdk_preinstalled", false) $jdk_package_name = hiera("bigtop::jdk_package_name", "jdk") stage {"pre": before => Stage["main"]} @@ -70,6 +71,7 @@ case $operatingsystem { package { $jdk_package_name: ensure => "installed", alias => "jdk", + noop => $jdk_preinstalled, } import "cluster.pp" diff --git a/bigtop-deploy/puppet/modules/alluxio/manifests/init.pp b/bigtop-deploy/puppet/modules/alluxio/manifests/init.pp new file mode 100644 index 00000000..66151cad --- /dev/null +++ b/bigtop-deploy/puppet/modules/alluxio/manifests/init.pp @@ -0,0 +1,79 @@ +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +class alluxio { + + class deploy ($roles) { + if ("alluxio-master" in $roles) { + include alluxio::master + } + + if ("alluxio-worker" in $roles) { + include alluxio::worker + } + } + + class common ($master_host){ + package { "alluxio": + ensure => latest, + } + + # add logging into /var/log/.. 
+ file { + "/etc/alluxio/conf/log4j.properties": + content => template("alluxio/log4j.properties"), + require => [Package["alluxio"]] + } + + # add alluxio-env.sh to point to alluxio master + file { "/etc/alluxio/conf/alluxio-env.sh": + content => template("alluxio/alluxio-env.sh"), + require => [Package["alluxio"]] + } + } + + class master { + include common + + exec { + "alluxio formatting": + command => "/usr/lib/alluxio/bin/alluxio format", + require => [ Package["alluxio"], File["/etc/alluxio/conf/log4j.properties"], File["/etc/alluxio/conf/alluxio-env.sh"] ] + } + + if ( $fqdn == $alluxio::common::master_host ) { + service { "alluxio-master": + ensure => running, + require => [ Package["alluxio"], Exec["alluxio formatting"] ], + hasrestart => true, + hasstatus => true, + } + } + + } + + class worker { + include common + + if ( $fqdn == $alluxio::common::master_host ) { + notice("alluxio ---> master host") + # We want master to run first in all cases + Service["alluxio-master"] ~> Service["alluxio-worker"] + } + + service { "alluxio-worker": + ensure => running, + require => [ Package["alluxio"], File["/etc/alluxio/conf/log4j.properties"], File["/etc/alluxio/conf/alluxio-env.sh"] ], + hasrestart => true, + hasstatus => true, + } + } +} diff --git a/bigtop-deploy/puppet/modules/alluxio/templates/alluxio-env.sh b/bigtop-deploy/puppet/modules/alluxio/templates/alluxio-env.sh new file mode 100755 index 00000000..27c7fb2b --- /dev/null +++ b/bigtop-deploy/puppet/modules/alluxio/templates/alluxio-env.sh @@ -0,0 +1,78 @@ +#!/usr/bin/env bash +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# This file contains environment variables required to run Alluxio. Copy it as alluxio-env.sh and +# edit that to configure Alluxio for your site. At a minimum, +# the following variables should be set: +# +# - JAVA_HOME, to point to your JAVA installation +# - ALLUXIO_MASTER_ADDRESS, to bind the master to a different IP address or hostname +# - ALLUXIO_UNDERFS_ADDRESS, to set the under filesystem address. +# - ALLUXIO_WORKER_MEMORY_SIZE, to set how much memory to use (e.g. 1000mb, 2gb) per worker +# - ALLUXIO_RAM_FOLDER, to set where worker stores in memory data +# - ALLUXIO_UNDERFS_HDFS_IMPL, to set which HDFS implementation to use (e.g. 
com.mapr.fs.MapRFileSystem, +# org.apache.hadoop.hdfs.DistributedFileSystem) + +# The following gives an example: + +if [[ `uname -a` == Darwin* ]]; then + # Assuming Mac OS X + export JAVA_HOME=${JAVA_HOME:-$(/usr/libexec/java_home)} + export ALLUXIO_RAM_FOLDER=/Volumes/ramdisk + export ALLUXIO_JAVA_OPTS="-Djava.security.krb5.realm= -Djava.security.krb5.kdc=" +else + # Assuming Linux + if [ -z "$JAVA_HOME" ]; then + export JAVA_HOME=/usr/lib/jvm/java-7-oracle + fi + export ALLUXIO_RAM_FOLDER=/mnt/ramdisk +fi + +export JAVA="$JAVA_HOME/bin/java" + +echo "Starting alluxio w/ java = $JAVA " + +export ALLUXIO_MASTER_ADDRESS=<%= @master_host %> +export ALLUXIO_UNDERFS_ADDRESS=$ALLUXIO_HOME/underfs +#export ALLUXIO_UNDERFS_ADDRESS=hdfs://localhost:9000 +export ALLUXIO_WORKER_MEMORY_SIZE=1GB +export ALLUXIO_UNDERFS_HDFS_IMPL=org.apache.hadoop.hdfs.DistributedFileSystem + +echo "ALLUXIO master => $ALLUXIO_MASTER_ADDRESS " + +CONF_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" + +export ALLUXIO_JAVA_OPTS+=" + -Dlog4j.configuration=file:$CONF_DIR/log4j.properties + -Dalluxio.debug=false + -Dalluxio.underfs.address=$ALLUXIO_UNDERFS_ADDRESS + -Dalluxio.underfs.hdfs.impl=$ALLUXIO_UNDERFS_HDFS_IMPL + -Dalluxio.data.folder=$ALLUXIO_UNDERFS_ADDRESS/tmp/alluxio/data + -Dalluxio.workers.folder=$ALLUXIO_UNDERFS_ADDRESS/tmp/alluxio/workers + -Dalluxio.worker.memory.size=$ALLUXIO_WORKER_MEMORY_SIZE + -Dalluxio.worker.data.folder=$ALLUXIO_RAM_FOLDER/alluxioworker/ + -Dalluxio.master.worker.timeout.ms=60000 + -Dalluxio.master.hostname=$ALLUXIO_MASTER_ADDRESS + -Dalluxio.master.journal.folder=$ALLUXIO_HOME/journal/ + -Dorg.apache.jasper.compiler.disablejsr199=true + -Djava.net.preferIPv4Stack=true +" + +# Master specific parameters. Default to ALLUXIO_JAVA_OPTS. +export ALLUXIO_MASTER_JAVA_OPTS="$ALLUXIO_JAVA_OPTS" + +# Worker specific parameters that will be shared to all workers. Default to ALLUXIO_JAVA_OPTS. +export ALLUXIO_WORKER_JAVA_OPTS="$ALLUXIO_JAVA_OPTS" diff --git a/bigtop-deploy/puppet/modules/tachyon/templates/log4j.properties b/bigtop-deploy/puppet/modules/alluxio/templates/log4j.properties index e3c5f04c..7ef17e68 100644 --- a/bigtop-deploy/puppet/modules/tachyon/templates/log4j.properties +++ b/bigtop-deploy/puppet/modules/alluxio/templates/log4j.properties @@ -13,9 +13,9 @@ # See the License for the specific language governing permissions and # limitations under the License. 
# May get overridden by System Property -tachyon.logger.type=Console +alluxio.logger.type=Console -log4j.rootLogger=INFO, ${tachyon.logger.type} +log4j.rootLogger=INFO, ${alluxio.logger.type} log4j.appender.Console=org.apache.log4j.ConsoleAppender log4j.appender.Console.Target=System.out @@ -23,8 +23,8 @@ log4j.appender.Console.layout=org.apache.log4j.PatternLayout log4j.appender.Console.layout.ConversionPattern=%d{ISO8601} %-5p %c{1} (%F:%M) - %m%n # Appender for Master -log4j.appender.MASTER_LOGGER=tachyon.Log4jFileAppender -log4j.appender.MASTER_LOGGER.File=/var/log/tachyon/master.log +log4j.appender.MASTER_LOGGER=alluxio.Log4jFileAppender +log4j.appender.MASTER_LOGGER.File=/var/log/alluxio/master.log log4j.appender.MASTER_LOGGER.MaxFileSize=10 log4j.appender.MASTER_LOGGER.MaxBackupIndex=100 @@ -34,8 +34,8 @@ log4j.appender.MASTER_LOGGER.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F #log4j.appender.MASTER_LOGGER.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n # Appender for Workers -log4j.appender.WORKER_LOGGER=tachyon.Log4jFileAppender -log4j.appender.WORKER_LOGGER.File=/var/log/tachyon/worker.log +log4j.appender.WORKER_LOGGER=alluxio.Log4jFileAppender +log4j.appender.WORKER_LOGGER.File=/var/log/alluxio/worker.log log4j.appender.WORKER_LOGGER.MaxFileSize=10 log4j.appender.WORKER_LOGGER.MaxBackupIndex=100 @@ -45,8 +45,8 @@ log4j.appender.WORKER_LOGGER.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F #log4j.appender.WORKER_LOGGER.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n # Appender for User -log4j.appender.USER_LOGGER=tachyon.Log4jFileAppender -log4j.appender.USER_LOGGER.File=/var/log/tachyon/logs/user.log +log4j.appender.USER_LOGGER=alluxio.Log4jFileAppender +log4j.appender.USER_LOGGER.File=/var/log/alluxio/logs/user.log log4j.appender.USER_LOGGER.MaxFileSize=10 log4j.appender.USER_LOGGER.MaxBackupIndex=10 log4j.appender.USER_LOGGER.DeletionPercentage=20 diff --git a/bigtop-deploy/puppet/modules/apex/manifests/init.pp b/bigtop-deploy/puppet/modules/apex/manifests/init.pp new file mode 100644 index 00000000..c4b91c91 --- /dev/null +++ b/bigtop-deploy/puppet/modules/apex/manifests/init.pp @@ -0,0 +1,30 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +class apex { + + class deploy ($roles) { + if ("apex-client" in $roles) { + include apex::client + } + } + + class client { + package { "apex": + ensure => latest, + require => Package["hadoop"], + } + } +} diff --git a/bigtop-deploy/puppet/modules/apex/tests/init.pp b/bigtop-deploy/puppet/modules/apex/tests/init.pp new file mode 100644 index 00000000..0885cbfa --- /dev/null +++ b/bigtop-deploy/puppet/modules/apex/tests/init.pp @@ -0,0 +1,17 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. 
See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +include apex +apex::client { "test-apex": } diff --git a/bigtop-deploy/puppet/modules/flink/manifests/init.pp b/bigtop-deploy/puppet/modules/flink/manifests/init.pp new file mode 100644 index 00000000..ae79fe45 --- /dev/null +++ b/bigtop-deploy/puppet/modules/flink/manifests/init.pp @@ -0,0 +1,62 @@ +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +class flink { + + class deploy ($roles) { + if ("flink-jobmanager" in $roles) { + include flink::jobmanager + } + + if ("flink-taskmanager" in $roles) { + include flink::taskmanager + } + } + + class common($jobmanager_host, $jobmanager_port, $ui_port, $storage_dirs) { + # make sure flink is installed + package { "flink": + ensure => latest + } + + # set values in flink-conf.yaml + file { "/etc/flink/conf/flink-conf.yaml": + content => template("flink/flink-conf.yaml"), + require => Package["flink"] + } + } + + class jobmanager { + include common + + service { "flink-jobmanager": + ensure => running, + require => Package["flink"], + subscribe => File["/etc/flink/conf/flink-conf.yaml"], + hasrestart => true, + hasstatus => true + + } + + } + + class taskmanager { + include common + + service { "flink-taskmanager": + ensure => running, + require => Package["flink"], + subscribe => File["/etc/flink/conf/flink-conf.yaml"], + hasrestart => true, + hasstatus => true, + } + } +} diff --git a/bigtop-deploy/puppet/modules/flink/templates/flink-conf.yaml b/bigtop-deploy/puppet/modules/flink/templates/flink-conf.yaml new file mode 100644 index 00000000..14e8853b --- /dev/null +++ b/bigtop-deploy/puppet/modules/flink/templates/flink-conf.yaml @@ -0,0 +1,31 @@ +################################################################################ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. 
You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +################################################################################ + + +# Configuration values managed by puppet: +jobmanager.rpc.address: <%= @jobmanager_host %> +jobmanager.rpc.port: <%= @jobmanager_port %> +jobmanager.web.port: <%= @ui_port %> + +<% if defined?(storage_dirs) %> +taskmanager.tmp.dirs: <%= @storage_dirs.join(":") %> +<% end %> + + +# For performance reasons it's highly recommended to allocate as much memory to the +# Flink TaskManager as possible by setting 'taskmanager.heap.mb'. diff --git a/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp b/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp index 64b3ee69..90c9505b 100644 --- a/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp +++ b/bigtop-deploy/puppet/modules/hadoop/manifests/init.pp @@ -61,11 +61,9 @@ class hadoop ($hadoop_security_authentication = "simple", include hadoop::historyserver include hadoop::proxyserver - Class['Hadoop::Init_hdfs'] -> Class['Hadoop::Resourcemanager'] if ("nodemanager" in $roles) { Class['Hadoop::Resourcemanager'] -> Class['Hadoop::Nodemanager'] } - Class['Hadoop::Init_hdfs'] -> Class['Hadoop::Historyserver'] } if ($hadoop::common_hdfs::ha == "disabled" and "secondarynamenode" in $roles) { @@ -140,6 +138,7 @@ class hadoop ($hadoop_security_authentication = "simple", $hadoop_rm_webapp_port = "8088", $hadoop_rt_port = "8025", $hadoop_sc_port = "8030", + $yarn_log_server_url = undef, $yarn_nodemanager_resource_memory_mb = undef, $yarn_scheduler_maximum_allocation_mb = undef, $yarn_scheduler_minimum_allocation_mb = undef, @@ -827,6 +826,7 @@ class hadoop ($hadoop_security_authentication = "simple", class nodemanager { + include common_mapred_app include common_yarn package { "hadoop-yarn-nodemanager": @@ -868,6 +868,7 @@ class hadoop ($hadoop_security_authentication = "simple", class client { include common_mapred_app + include common_yarn $hadoop_client_packages = $operatingsystem ?
{ /(OracleLinux|CentOS|RedHat|Fedora)/ => [ "hadoop-doc", "hadoop-hdfs-fuse", "hadoop-client", "hadoop-libhdfs", "hadoop-debuginfo" ], diff --git a/bigtop-deploy/puppet/modules/hadoop/templates/yarn-site.xml b/bigtop-deploy/puppet/modules/hadoop/templates/yarn-site.xml index 6f6c4648..054ae653 100644 --- a/bigtop-deploy/puppet/modules/hadoop/templates/yarn-site.xml +++ b/bigtop-deploy/puppet/modules/hadoop/templates/yarn-site.xml @@ -145,7 +145,12 @@ <name>yarn.log-aggregation-enable</name> <value>true</value> </property> - +<% if @yarn_log_server_url %> + <property> + <name>yarn.log.server.url</name> + <value><%= @yarn_log_server_url %></value> + </property> +<% end %> <property> <name>yarn.dispatcher.exit-on-error</name> <value>true</value> diff --git a/bigtop-deploy/puppet/modules/hadoop_hbase/templates/hbase-env.sh b/bigtop-deploy/puppet/modules/hadoop_hbase/templates/hbase-env.sh index 2f965813..a2dc9aff 100644 --- a/bigtop-deploy/puppet/modules/hadoop_hbase/templates/hbase-env.sh +++ b/bigtop-deploy/puppet/modules/hadoop_hbase/templates/hbase-env.sh @@ -1,7 +1,5 @@ # #/** -# * Copyright 2007 The Apache Software Foundation -# * # * Licensed to the Apache Software Foundation (ASF) under one # * or more contributor license agreements. See the NOTICE file # * distributed with this work for additional information @@ -21,15 +19,24 @@ # Set environment variables here. -# The java implementation to use. Java 1.6 required. +# This script sets variables multiple times over the course of starting an hbase process, +# so try to keep things idempotent unless you want to take an even deeper look +# into the startup scripts (bin/hbase, etc.) + +# The java implementation to use. Java 1.7+ required. # export JAVA_HOME=/usr/java/jdk1.6.0/ # Extra Java CLASSPATH elements. Optional. export HBASE_CLASSPATH=/etc/hadoop/conf -# The maximum amount of heap to use, in MB. Default is 1000. +# The maximum amount of heap to use. Default is left to JVM default. +# export HBASE_HEAPSIZE=1G export HBASE_HEAPSIZE=<%= @heap_size %> +# Uncomment below if you intend to use off heap cache. For example, to allocate 8G of +# offheap, set the value to "8G". +# export HBASE_OFFHEAPSIZE=1G + # Extra Java runtime options. # Below are what we set by default. May only work with SUN JVM. # For more on why as well as other possible settings, @@ -42,28 +49,70 @@ export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Djava.security.auth.login.config=/ export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Djava.security.auth.login.config=/etc/hbase/conf/jaas.conf" <% end -%> -# Uncomment below to enable java garbage collection logging. -# export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$HBASE_HOME/logs/gc-hbase.log" +# Uncomment one of the below three options to enable java garbage collection logging for the server-side processes. + +# This enables basic gc logging to the .out file. +# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps" + +# This enables basic gc logging to its own file. +# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR . +# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>" + +# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+. +# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR . 
+# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M" + +# Uncomment one of the below three options to enable java garbage collection logging for the client processes. + +# This enables basic gc logging to the .out file. +# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps" + +# This enables basic gc logging to its own file. +# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR . +# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>" + +# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+. +# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR . +# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M" + +# See the package documentation for org.apache.hadoop.hbase.io.hfile for other configurations +# needed setting up off-heap block caching. # Uncomment and adjust to enable JMX exporting # See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access. # More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html -# +# NOTE: HBase provides an alternative JMX implementation to fix the random ports issue, please see JMX +# section in HBase Reference Guide for instructions. + # export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false" # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101" # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102" # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103" # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104" +# export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105" # File naming hosts on which HRegionServers will run. $HBASE_HOME/conf/regionservers by default. # export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers +# Uncomment and adjust to keep all the Region Server pages mapped to be memory resident +#HBASE_REGIONSERVER_MLOCK=true +#HBASE_REGIONSERVER_UID="hbase" + +# File naming hosts on which backup HMaster will run. $HBASE_HOME/conf/backup-masters by default. +# export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters + # Extra ssh options. Empty by default. # export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR" # Where log files are stored. $HBASE_HOME/logs by default. # export HBASE_LOG_DIR=${HBASE_HOME}/logs +# Enable remote JDWP debugging of major HBase processes. 
Meant for Core Developers +# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070" +# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071" +# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072" +# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073" + # A string representing this instance of hbase. $USER by default. # export HBASE_IDENT_STRING=$USER @@ -80,3 +129,12 @@ export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Djava.security.auth.lo # Tell HBase whether it should manage it's own instance of Zookeeper or not. # export HBASE_MANAGES_ZK=true + +# The default log rolling policy is RFA, where the log file is rolled as per the size defined for the +# RFA appender. Please refer to the log4j.properties file to see more details on this appender. +# In case one needs to do log rolling on a date change, one should set the environment property +# HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA". +# For example: +# HBASE_ROOT_LOGGER=INFO,DRFA +# The reason for changing default to RFA is to avoid the boundary case of filling out disk space as +# DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context. diff --git a/bigtop-deploy/puppet/modules/hadoop_pig/manifests/init.pp b/bigtop-deploy/puppet/modules/hadoop_pig/manifests/init.pp index 0a09dec3..9582ffd0 100644 --- a/bigtop-deploy/puppet/modules/hadoop_pig/manifests/init.pp +++ b/bigtop-deploy/puppet/modules/hadoop_pig/manifests/init.pp @@ -22,11 +22,13 @@ class hadoop_pig { } class client { + include hadoop::common + package { "pig": ensure => latest, require => Package["hadoop"], - } - + } + file { "/etc/pig/conf/pig.properties": content => template('hadoop_pig/pig.properties'), require => Package["pig"], diff --git a/bigtop-deploy/puppet/modules/hue/manifests/init.pp b/bigtop-deploy/puppet/modules/hue/manifests/init.pp index fa6af886..2b646877 100644 --- a/bigtop-deploy/puppet/modules/hue/manifests/init.pp +++ b/bigtop-deploy/puppet/modules/hue/manifests/init.pp @@ -29,6 +29,13 @@ class hue { class server($sqoop2_url = "http://localhost:12000/sqoop", $solr_url = "http://localhost:8983/solr/", $hbase_thrift_url = "", $webhdfs_url, $rm_host, $rm_port, $oozie_url, $rm_proxy_url, $history_server_url, $hive_host = "", $hive_port = "10000", + $zookeeper_host_port = "localhost:2181", + $force_username_lowercase = "false", + $group_filter_value = "objectclass=groupOfEntries", + $nt_domain = undef, + $use_ldap_username_pattern = false, + $ldap_username_pattern = undef, + $remote_deployement_dir = "/user/hue/oozie/deployments", $rm_logical_name = undef, $rm_api_port = "8088", $app_blacklist = "impala, security", $hue_host = "0.0.0.0", $hue_port = "8888", $hue_timezone = "America/Los_Angeles", $default_fs = "hdfs://localhost:8020", @@ -38,7 +45,7 @@ class hue { $base_dn = undef , $bind_dn = undef, $bind_password = undef, $user_name_attr = undef, $user_filter = undef, $group_member_attr = undef, $group_filter = undef, - $hue_apps = "all" ) { + $hue_apps = "all", $default_hdfs_superuser = "hdfs" ) { $hue_packages = $hue_apps ? 
{ "all" => [ "hue", "hue-server" ], # The hue metapackage requires all apps diff --git a/bigtop-deploy/puppet/modules/hue/templates/hue.ini b/bigtop-deploy/puppet/modules/hue/templates/hue.ini index 8c81b69e..51c03160 100644 --- a/bigtop-deploy/puppet/modules/hue/templates/hue.ini +++ b/bigtop-deploy/puppet/modules/hue/templates/hue.ini @@ -70,7 +70,7 @@ ## default_user=hue # This should be the hadoop cluster admin - ## default_hdfs_superuser=hdfs + default_hdfs_superuser=<%= @default_hdfs_superuser %> # If set to false, runcpserver will not actually start the web server. # Used if Apache is being used as a WSGI container. @@ -198,6 +198,11 @@ # The search base for finding users and groups base_dn="<%= @base_dn %>" +<% if @nt_domain -%> + # The NT domain to connect to (only for use with Active Directory) + nt_domain=<%= @nt_domain %> +<% end -%> + # URL of the LDAP server ldap_url=<%= @ldap_url %> @@ -229,8 +234,12 @@ <% else -%> # Pattern for searching for usernames -- Use <username> for the parameter # For use when using LdapBackend for Hue authentication - # ldap_username_pattern="uid=<username>,ou=People,dc=mycompany,dc=com" +<% if @use_ldap_username_pattern -%> + # for example, ldap_username_pattern=uid=<username>,ou=People,dc=mycompany,dc=com + ldap_username_pattern="<%= @ldap_username_pattern %>" +<% end -%> + search_bind_authentication=false <% end -%> # Execute this script to produce the bind user password. This will be used @@ -248,7 +257,7 @@ ignore_username_case=true # Force usernames to lowercase when creating new users from LDAP. - ## force_username_lowercase=false + force_username_lowercase=<%= @force_username_lowercase %> # Choose which kind of subgrouping to use: nested or suboordinate (deprecated). ## subgroups=suboordinate @@ -282,7 +291,7 @@ # Base filter for searching for groups <% if @group_filter -%> - group_filter="objectclass=groupOfEntries" + group_filter="<%= @group_filter_value %>" <% end -%> # The group name attribute in the LDAP schema @@ -359,7 +368,7 @@ # Kerberos principal name for Hue hue_principal=hue/<%= @fqdn %>@<%= @kerberos_realm %> # Path to kinit - kinit_path=<%= (@operatingsystem == 'ubuntu' || @operatingsystem == 'Debian') ? '/usr/bin' : '/usr/kerberos/bin' %>/kinit + kinit_path=<%= (@operatingsystem == 'ubuntu' || @operatingsystem == 'Debian' || @operatingsystem == 'CentOS' ) ? '/usr/bin' : '/usr/kerberos/bin' %>/kinit <% end -%> # Configuration options for using OAuthBackend (core) login @@ -660,7 +669,7 @@ security_enabled=<%= if (@kerberos_realm != "") ; "true" else "false" end %> # Location on HDFS where the workflows/coordinator are deployed when submitted. - remote_deployement_dir=/user/hue/oozie/deployments + remote_deployement_dir=<%= @remote_deployement_dir %> ########################################################################### @@ -908,7 +917,7 @@ [[[default]]] # Zookeeper ensemble. Comma separated list of Host/Port. # e.g. 
localhost:2181,localhost:2182,localhost:2183 - host_ports=localhost:2181 + host_ports=<%= @zookeeper_host_port %> # The URL of the REST contrib service (required for znode browsing) rest_url=http://localhost:9998 diff --git a/bigtop-deploy/puppet/modules/mahout/manifests/init.pp b/bigtop-deploy/puppet/modules/mahout/manifests/init.pp index 0c55a9be..cfb08dfb 100644 --- a/bigtop-deploy/puppet/modules/mahout/manifests/init.pp +++ b/bigtop-deploy/puppet/modules/mahout/manifests/init.pp @@ -22,6 +22,8 @@ class mahout { } class client { + include hadoop::common + package { "mahout": ensure => latest, require => Package["hadoop"], diff --git a/bigtop-deploy/puppet/modules/qfs/README.md b/bigtop-deploy/puppet/modules/qfs/README.md new file mode 100644 index 00000000..22254203 --- /dev/null +++ b/bigtop-deploy/puppet/modules/qfs/README.md @@ -0,0 +1,80 @@ +QFS puppet module +================= +This module contains puppet recipes for various qfs components, e.g. metaserver, +chunkserver, and the webui. It is responsible for the following items: + - Installing all required packages, configuration, init scripts, etc. + - Wiring everything up so that each service can talk to the others + +hadoop-qfs +========== +Furthermore, this module also installs a compatibility wrapper script called +hadoop-qfs to make the hadoop and qfs integration and interaction easier. + +In order to tell the main hadoop command to use qfs as the underlying +filesystem, extra options must be specified. For example, to issue a `hadoop fs` +command, the full command line would look like this: + + $ JAVA_LIBRARY_PATH=/usr/lib/qfs hadoop fs \ + -Dfs.qfs.impl=com.quantcast.qfs.hadoop.QuantcastFileSystem \ + -Dfs.default.name=qfs://localhost:20000 \ + -Dfs.qfs.metaServerHost=localhost \ + -Dfs.qfs.metaServerPort=20000 \ + -ls / + +This (a) is cumbersome and (b) exposes low-level details, e.g. metaserver port +numbers, to the user who likely doesn't care. In order to avoid a poor user +experience, we provide a wrapper script around the main hadoop command called +`hadoop-qfs` which handles all the boilerplate for you. Using `hadoop-qfs`, the +command above is reduced to the following: + + $ hadoop-qfs fs -ls / + +The `hadoop-qfs` command also supports submitting jobs to hadoop which will use +qfs as the underlying filesystem instead of the default HDFS. The process is +exactly the same as with the standard `hadoop` command, except using `hadoop-qfs`. + + $ hadoop-qfs jar hadoop-mapreduce-examples.jar pi 100 100 + +Any output data the job writes will be stored in qfs instead of HDFS. Note that +when submitting jobs through the `hadoop-qfs` command, both the path to the jar +file and the main class are required options. + +See the usage for `hadoop-qfs` (`hadoop-qfs --help`) for more information. + +Python +====== +To use the qfs bindings in python, you must set the LD_LIBRARY_PATH to where the +qfs shared objects are stored so that they can be found. + + [root@bigtop1 /]# LD_LIBRARY_PATH=/usr/lib/qfs python + Python 2.6.6 (r266:84292, Jul 23 2015, 15:22:56) + [GCC 4.4.7 20120313 (Red Hat 4.4.7-11)] on linux2 + Type "help", "copyright", "credits" or "license" for more information. + >>> import qfs + >>> + +Why Not Update Hadoop's `core-site.xml`? +======================================== +The following is a short discussion of the rationale behind this script's +implementation and some of the other options considered.
+
+One option is to include these configuration options in the core-site.xml file
+in a hadoop configuration directory. If we modify the core-site.xml file that
+the hadoop component installs, qfs silently takes precedence over hdfs. This
+results in a bad user experience since users should have a clear way to choose
+which filesystem they want to interact with.
+
+Another option is to provide our own core-site.xml specific to qfs in an
+alternate location and use the --config option to the hadoop command. This
+works, but because we would be overriding *all* configuration, all of the other
+configuration in place for hadoop would either be missing or would have to be
+symlinked in to be carried over. This is annoying and brittle -- new
+configuration will have to be carried over whenever it is added. Having two
+places where configuration needs to live is usually a bad idea.
+
+The final option considered, and the one used, is to provide a wrapper around
+the hadoop command that does not touch any hadoop configuration and instead
+sets the necessary override parameters on the command line. This seemed the
+cleanest solution and also makes sure that a qfs installation is as
+unobtrusive as possible. Unfortunately, the hadoop command is very picky about
+where the parameter overrides go depending on the subcommand used.
diff --git a/bigtop-deploy/puppet/modules/qfs/manifests/init.pp b/bigtop-deploy/puppet/modules/qfs/manifests/init.pp
new file mode 100644
index 00000000..c1602719
--- /dev/null
+++ b/bigtop-deploy/puppet/modules/qfs/manifests/init.pp
@@ -0,0 +1,161 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
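+
+# For illustration only: the deploy class below is driven by a list of role
+# names, so a site manifest could wire the module up roughly like this
+# (how qfs::common gets its parameters is deployment specific, e.g. hiera):
+#
+#   class { 'qfs::deploy':
+#     roles => ['qfs-metaserver', 'qfs-chunkserver', 'qfs-client'],
+#   }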
+ +class qfs { + class deploy($roles) { + if ("qfs-metaserver" in $roles) { + include qfs::metaserver + } + + if ("qfs-chunkserver" in $roles) { + include qfs::chunkserver + } + + if ("qfs-client" in $roles) { + include qfs::client + } + } + + class common($metaserver_host, $metaserver_port, $chunkserver_port, + $metaserver_client_port, $chunkserver_client_port) { + $cluster_key = sha1($metaserver_host) + $storage_dirs = suffix($hadoop::hadoop_storage_dirs, "/qfs") + + hadoop::create_storage_dir { $qfs::common::storage_dirs: } -> + file { $qfs::common::storage_dirs: + ensure => directory, + owner => root, + group => root, + mode => 0755, + } + } + + class metaserver { + include common + + package { "qfs-metaserver": + ensure => latest, + } + + $metaserver_conf = "/etc/qfs/MetaServer.prp" + file { $metaserver_conf: + content => template("qfs/MetaServer.prp"), + require => Package["qfs-metaserver"], + } + + file { [ + "${qfs::common::storage_dirs[0]}/metaserver", + "${qfs::common::storage_dirs[0]}/metaserver/transaction_logs", + "${qfs::common::storage_dirs[0]}/metaserver/checkpoint", + ]: + ensure => directory, + owner => qfs, + group => qfs, + mode => 0755, + before => Service['qfs-metaserver'], + require => [ + File[$qfs::common::storage_dirs[0]], + Package['qfs-metaserver'] + ], + } + + exec { "mkfs": + command => "/usr/bin/metaserver -c $metaserver_conf", + creates => "${qfs::common::storage_dirs[0]}/metaserver/checkpoint/latest", + user => qfs, + group => qfs, + require => [ + Package["qfs-metaserver"], + File[$metaserver_conf], + File["${qfs::common::storage_dirs[0]}/metaserver/checkpoint"], + ], + } + + if ($fqdn == $qfs::common::metaserver_host) { + service { "qfs-metaserver": + ensure => running, + require => [ + Package["qfs-metaserver"], + File[$metaserver_conf], + Exec["mkfs"], + ], + hasrestart => true, + hasstatus => true, + } + } + } + + class chunkserver { + include common + + package { "qfs-chunkserver": + ensure => latest, + } + + $chunkserver_conf = "/etc/qfs/ChunkServer.prp" + file { $chunkserver_conf: + content => template("qfs/ChunkServer.prp"), + require => Package["qfs-chunkserver"], + } + + $cs_dirs = suffix($hadoop::hadoop_storage_dirs, "/qfs/chunkserver") + $cs_chunks_dirs = suffix($hadoop::hadoop_storage_dirs, "/qfs/chunkserver/chunks") + $storage_dirs = concat($cs_dirs, $cs_chunks_dirs) + + file { $storage_dirs: + ensure => directory, + owner => qfs, + group => qfs, + mode => 0755, + before => Service['qfs-chunkserver'], + require => [ + File[$qfs::common::storage_dirs], + Package['qfs-chunkserver'] + ], + } + + service { "qfs-chunkserver": + ensure => running, + require => [ + Package["qfs-chunkserver"], + File[$chunkserver_conf] + ], + hasrestart => true, + hasstatus => true, + } + } + + class client { + include common + + package { [ + "qfs-client", + "qfs-hadoop", + "qfs-java", + ]: + ensure => latest, + } + + file { "/etc/qfs/QfsClient.prp": + content => template("qfs/QfsClient.prp"), + require => Package["qfs-client"], + } + + file { "/usr/bin/hadoop-qfs": + content => template("qfs/hadoop-qfs"), + mode => 0755, + } + } +} diff --git a/bigtop-deploy/puppet/modules/qfs/templates/ChunkServer.prp b/bigtop-deploy/puppet/modules/qfs/templates/ChunkServer.prp new file mode 100644 index 00000000..84875aea --- /dev/null +++ b/bigtop-deploy/puppet/modules/qfs/templates/ChunkServer.prp @@ -0,0 +1,280 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. 
See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# The chunk server configuration.
+
+# The following set of parameters must be specified at startup time. Other
+# parameters can be changed at runtime, and it is usually more convenient to
+# specify these in the meta server configuration. The meta server broadcasts
+# the corresponding chunk server parameters to all connected chunk servers.
+
+# The meta server location.
+chunkServer.metaServer.hostname = <%= scope['qfs::common::metaserver_host'] %>
+chunkServer.metaServer.port = <%= scope['qfs::common::metaserver_port'] %>
+
+# Client connection listener ip address to bind to.
+# Use :: to bind to ipv6 address any.
+# Default is empty, treated as 0.0.0.0 ipv4 address any, unless the following
+# parameter chunkServer.clientIpV6Only is set to 1
+# chunkServer.clientIp =
+
+# Accept ipv4 chunk server connections.
+# Default is 0, accept ipv4 connections. Only has effect if
+# chunkServer.clientIp is left empty, or set to ::
+# chunkServer.clientIpV6Only = 0
+
+# Port to open for client connections
+chunkServer.clientPort = <%= scope['qfs::common::chunkserver_client_port'] %>
+
+# Chunk server's rack id. Has effect only if meta server rack prefixes
+# (metaServer.rackPrefixes) are not set, or the chunk server's ip does not
+# match any of the prefixes.
+# Default is no rack assigned.
+# chunkServer.rackId = -1
+
+# Space separated list of directories to store chunks (blocks).
+# Usually one directory per physical disk. More than one directory can
+# be used in the cases where the host file system has problems / limitations
+# with large directories.
+# Directories that are "not available" (don't exist, io errors,
+# "evacuate.done" file exists, etc.) at a given moment are periodically
+# scanned.
+# If a directory becomes available while the chunk server is running, the chunk
+# server deletes all chunk files in this directory (if any), and starts using
+# this directory.
+# All available directories are periodically scanned; if a directory becomes
+# "unavailable" all chunks in this directory are declared lost, and the
+# directory gets added to the "not available" directories which are
+# periodically scanned as described above.
+chunkServer.chunkDir = <%= scope['qfs::common::storage_dirs'][0] %>/chunkserver/chunks
+
+# Number of io threads (max. number of disk io requests in flight) per host
+# file system.
+# The typical setup is to have one host file system per physical disk.
+# Even if a raid controller is available, a jbod configuration is recommended,
+# leaving the failure handling and striping to the distributed file system.
+# The default is 2.
+# With large requests (~1MB) two io requests in flight should be sufficient.
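+# Worked example (illustrative, not from the stock template): with four data
+# disks, each with its own host file system listed in chunkServer.chunkDir,
+# the default of 2 io threads per file system below allows up to 4 * 2 = 8
+# disk io requests in flight on this chunk server.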
+# chunkServer.diskQueue.threadCount = 2
+
+# Number of "client" / network io threads used to service "client" requests,
+# including requests from other chunk servers, and to handle synchronous
+# replication, chunk re-replication, and chunk RS recovery. Client threads make
+# it possible to use more than one cpu to perform network io, encryption /
+# decryption, request parsing, checksum and RS recovery calculation.
+# Multiple "client" threads might help to increase network throughput. With
+# modern Intel CPUs, maximum single cpu throughput can be expected to be
+# roughly 300MB/sec (just under a 3Gbps network rate) with 1MB average write
+# request size, and approximately 150MB/sec with network encryption enabled.
+# This parameter has effect only on startup. Please see the meta server
+# configuration for a description of the related parameters. The parameters
+# described in the meta server configuration can be changed at any time, and
+# the changes will have effect without a chunk server restart.
+# Default is 0. All network io, replication and recovery is handled by the
+# "main" thread.
+# chunkServer.clientThreadCount = 0
+
+# Set client thread affinity to CPU, starting from the specified CPU index. The
+# first cpu index is 0.
+# If the number of CPUs is less than the start index plus the number of
+# threads, then the thread affinity at start index plus CPU count will be set
+# to the last CPU.
+# Setting affinity might help to reduce processor ram cache misses.
+# The parameter has effect only on startup, and has effect only on Linux OS.
+# Default is -1, no cpu affinity set.
+# chunkServer.clientThreadFirstCpuIndex = -1
+
+# Set the cluster / fs key, to protect against data loss and "data corruption"
+# due to connecting to a meta server hosting a different file system.
+chunkServer.clusterKey = <%= scope['qfs::common::cluster_key'] %>
+
+# Redirect stderr and stdout into /dev/null to handle the case where one or
+# both are written into a file and the host file system / disk where the file
+# resides exhibits noticeable io stalls, or is completely unavailable.
+# Normally all log message output is performed by the message writer (thread),
+# which deals with log io stalls by dropping log messages. This redirection is
+# for extra safety, to handle the case where some library function attempts to
+# write into stdout / stderr.
+chunkServer.stdout = /dev/null
+chunkServer.stderr = /dev/null
+
+# The following parameter defines the max size of io buffer memory used by the
+# chunk server.
+# The value set here, 128K, means 128K * 4K buffers = 512MB of buffers.
+# The default values are 64K (128MB) for 32 bit builds, and 192K (768MB) for 64
+# bit builds.
+# The optimal amount of memory depends on the number of disks in the system,
+# and the io (read, write) concurrency -- the number of concurrent "clients".
+# The memory should be increased if a large number of concurrent write
+# appenders is expected. Ideally the disk io request should be around 1MB, thus
+# for each chunk opened for append at least 1MB of io buffers is recommended.
+chunkServer.ioBufferPool.partitionBufferCount = 131072
+
+# Buffer manager portion of all io buffers.
+# This value defines the max amount of io buffers that can be used for
+# servicing "client" requests, chunk re-replication, and recovery.
+# The remaining (1 - chunkServer.bufferManager.maxRatio) is used for write
+# append buffering and other "internal" purposes.
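+# Worked example (using the pool size configured above): 131072 buffers * 4K =
+# 512MB in total; with the default ratio of 0.4 below, roughly 205MB is
+# available for servicing "client" requests, re-replication, and recovery,
+# leaving roughly 307MB for write append buffering and other "internal" uses.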
+# Default is 0.4 or 40% +# chunkServer.bufferManager.maxRatio = 0.4 + +# Set the following to 1 if no backward compatibility with the previous kfs +# releases required. 0 is the default. +# When set to 0 the 0 header checksum (all 8 bytes must be 0) is treated as +# no checksum and therefore no chunk file header checksum verification +# performed. +# The downside of the compatibility mode is that chunk server might not detect +# the cases where the host os zero fills the data during the host file system +# recovery / journal / transaction log replay, +# thus the data loss / corruption problem might not be detected. +# chunkServer.requireChunkHeaderChecksum = 0 + +# If set to a value greater than 0 then locked memory limit will be set to the +# specified value, and mlock(MCL_CURRENT|MCL_FUTURE) invoked. +# On linux running under non root user setting locked memory "hard" limit +# greater or equal to the specified value required. ulimit -l can be used for +# example. +# Default is 0 -- no memory locking. +# chunkServer.maxLockedMemory = 0 + +# Mlock io buffers memory at startup, if set to non 0. +# Default is 0 -- no io buffer memory locking. +# chunkServer.ioBufferPool.lockMemory = 0 + +# ---------------------------------- Message log. ------------------------------ + +# Set reasonable log level, and other message log parameter to handle the case +# when meta server not available, or doesn't accept this chunk server for any +# reason. +# The chunk servers message log configuration parameters including log level +# level can be changed in the meta server configuration file. +chunkServer.msgLogWriter.logLevel = INFO + +# Colon (:) separated file name prefixes to store log segments. +# Default is empty list. The default is to use file name from the command line +# or if none specified write into file descriptor 2 -- stderror. +# chunkServer.msgLogWriter.logFilePrefixes = + +# Maximum log segment size. +# Default is -1 -- unlimited. +# chunkServer.msgLogWriter.maxLogFileSize = -1 + +# Maximum number of log segments. +# Default is -1 -- unlimited. +# chunkServer.msgLogWriter.maxLogFiles = -1 + +# Max. time to wait for the log buffer to become available. +# When wait is enabled the request processing thread will wait for the log +# buffer disk io to complete. If the disk subsystem cannot handle the +# amount of logging it will slow down the request processing. +# For chunk servers keeping the default is strongly recommended to minimize +# dependency on the host's disk subsystem reliability and performance. +# Default is -1. Do not wait, drop log record instead. +# chunkServer.msgLogWriter.waitMicroSec = -1 + +# -------------------- Chunk and meta server authentication. ------------------- +# By default chunk and meta server authentication is off. +# +# If any of the following meta authentication methods is configured then chunk +# server requires QFS client connection to be authenticated. In other words, the +# QFS client, and other chunk server acting as a client, must obtain from the +# meta server chunk server access token and corresponding key and use this token +# and the key to authenticate with the chunk server. + +# ================= X509 authentication ======================================== +# +# Chunk server's X509 certificate file in PEM format. +# chunkserver.meta.auth.X509.X509PemFile = + +# Password if X509 PEM file is encrypted. +# chunkserver.meta.auth.X509.X509Password = + +# Chunk server's private key file. 
+# chunkserver.meta.auth.X509.PKeyPemFile = + +# Password if private key PEM file is encrypted. +# chunkserver.meta.auth.X509.PKeyPassword = + +# Certificate authorities file. Used for both meta server certificate +# validation and to create certificate chain with chunk server's X509 +# certificate. +# chunkserver.meta.auth.X509.CAFile = + +# Certificate authorities directory can be used in addition to CAFile. +# For more detailed information please see SSL_CTX_load_verify_locations manual +# page. CAFile/CADir corresponds to CAfile/CApath in the man page. +# chunkserver.meta.auth.X509.CADir = + +# If set (the default) verify peer certificate, and declare error if peer, i.e. +# meta server, does not preset "trusted" valid X509 certificate. +# Default is on. +# chunkserver.meta.auth.X509.verifyPeer = 1 + +# OpenSSL cipher configuration. +# chunkserver.meta.auth.X509.cipher = !ADH:!AECDH:!MD5:HIGH:@STRENGTH + +# The long integer value passed to SSL_CTX_set_options() call. +# See open ssl documentation for details. +# Default is the integer value that corresponds to SSL_OP_NO_COMPRESSION +# chunkserver.meta.auth.X509.options = + +# ================= Kerberos authentication ==================================== +# +# Kerberos principal: service/host@realm + +# Meta server's Kerberos principal [service/host@realm] service name part. +# chunkserver.meta.auth.krb5.service = + +# Meta server's Kerberos principal [service/host@realm] host name part. +# chunkserver.meta.auth.krb5.host = + +# Kerberos keytab file with the key(s) that corresponds to the chunk server's +# principal. +# chunkserver.meta.auth.krb5.keytab = + +# Chunk server's kerberos principal. krb5_parse_name() used to convert the name +# into the Kerberos 5 internal principal representation. +# chunkserver.meta.auth.krb5.clientName = + +# Force Kerberos client cache initialization during intialization. +# Default is off. +# chunkserver.meta.auth.krb5.initClientCache = 0 + +# OpenSSL cipher configuration for TLS-PSK authentication method. This method +# is used with delegation and with Kerberos authentication. +# chunkserver.meta.auth.psk.cipherpsk = !ADH:!AECDH:!MD5:!3DES:PSK:@STRENGTH + +# The long integer value passed to SSL_CTX_set_options() call. +# See open ssl documentation for details. +# Default is the integer value that corresponds to the logical OR of +# SSL_OP_NO_COMPRESSION and SSL_OP_NO_TICKET +# metaServer.clientAuthentication.psk.options = + +# ================= PSK authentication ========================================= +# +# PSK chunk server authentication is intended only for testing and possibly for +# small [test] clusters with very few chunk servers, where the same +# authentication credentials [PSK "key"] are used for for all chunk servers. + +# Chunk server's PSK key, Must be identical to the key the meta server +# configured with. See metaServer.CSAuthentication.psk.key parameter description +# in the annotated meta server configuration file. +# chunkserver.meta.auth.psk.key = + +# Chunk server's PSK key id. See metaServer.CSAuthentication.psk.keyId parameter +# description in the annotated meta server configuration file. 
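+# Illustrative test-cluster values only (placeholders, not part of the stock
+# template): the key must be a valid base 64 sequence, e.g. generated with
+# "head -c 32 /dev/urandom | base64", and must be identical to the key the
+# meta server is configured with:
+# chunkserver.meta.auth.psk.keyId = test-chunkservers
+# chunkserver.meta.auth.psk.key = <base-64-encoded-key>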
+# chunkserver.meta.auth.psk.keyId =
+
+#-------------------------------------------------------------------------------
diff --git a/bigtop-deploy/puppet/modules/qfs/templates/MetaServer.prp b/bigtop-deploy/puppet/modules/qfs/templates/MetaServer.prp
new file mode 100644
index 00000000..0bcba747
--- /dev/null
+++ b/bigtop-deploy/puppet/modules/qfs/templates/MetaServer.prp
@@ -0,0 +1,1179 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# The meta server configuration.
+
+# Client listener port.
+metaServer.clientPort = <%= scope['qfs::common::metaserver_client_port'] %>
+
+# Client connection listener ip address to bind to.
+# Use :: to bind to ipv6 address any.
+# Default is empty, treated as 0.0.0.0 ipv4 address any, unless the following
+# parameter metaServer.clientIpV6Only is set to 1
+# metaServer.clientIp =
+
+# Accept ipv4 client connections.
+# Default is 0, accept ipv4 connections. Only has effect if
+# metaServer.clientIp is left empty, or set to ::
+# metaServer.clientIpV6Only = 0
+
+# Chunk server connection listener ip address to bind to.
+# Use :: to bind to ipv6 address any.
+# Default is empty, treated as 0.0.0.0 ipv4 address any, unless the following
+# parameter metaServer.clientIpV6Only is set to 1
+# metaServer.clientIp =
+
+# Accept ipv4 chunk server connections.
+# Default is 0, accept ipv4 connections. Only has effect if
+# metaServer.clientIp is left empty, or set to ::
+# metaServer.clientIpV6Only = 0
+
+# Chunk server listener port.
+metaServer.chunkServerPort = <%= scope['qfs::common::chunkserver_port'] %>
+
+# Meta server transaction log directory.
+metaServer.logDir = <%= scope['qfs::common::storage_dirs'][0] %>/metaserver/transaction_logs
+
+# Meta server checkpoint directory.
+metaServer.cpDir = <%= scope['qfs::common::storage_dirs'][0] %>/metaserver/checkpoint
+
+# Allow automatically creating an empty file system if the checkpoint file does
+# not exist.
+# The default is 0, as under normal circumstances, where the file system
+# content is of value, completely losing the checkpoint and transaction log and
+# automatically creating an empty fs will have the same effect as a
+# conventional "mkfs". All chunks (blocks) will get deleted, and restoring the
+# checkpoint and logs later won't be sufficient to recover the data.
+# Use the "-c" command line option to create a new empty file system. For
+# example:
+# metaserver -c MetaServer.prp
+# metaServer.createEmptyFs = 0
+
+# Root directory permissions -- used only when the new file system is created.
+# metaServer.rootDirUser = 0
+# metaServer.rootDirGroup = 0
+# metaServer.rootDirMode = 0755
+
+# Defaults for checkpoint and transaction log without permissions conversion on
+# startup.
+# metaServer.defaultLoadUser = 0 +# metaServer.defaultLoadGroup = 0 +# metaServer.defaultLoadFileMode = 0644 +# metaServer.defaultLoadDirMode = 0755 + +# The size of the "client" thread pool. +# When set to greater than 0, dedicated threads to do client network io, request +# parsing, and response assembly are created. The thread pool size should +# usually be (at least one) less than the number of cpus. "Client" threads help +# with processing large amount of ["short"] requests where more cpu used for +# context switch, network io, request parsing, and response assembly, than +# the cpu for the request processing itself. For example i-node attribute +# lookup, or write append chunk allocations that can be satisfied from the write +# append allocation cache. +# Default is 0 -- no dedicated "client" threads. +# metaServer.clientThreadCount = 0 + +# Meta server threads affinity. +# Presently only supported on linux. +# The first cpu index to set thread affinity to. +# The main thread will be assigned to the cpu at the specified index, then the +# next "client" thread will be assigned to the cpu index plus one and so on. +# For example with 2 client threads and start cpu index 0 the threads affinity +# would be 0 1 2 respectively. +# Useful on machines with more than one multi-core processor with shared dram +# cache. Assigning the threads to the same processor might help minimize dram +# cache misses. +# Default is off (start index less than 0) no thread affinity set. +# metaServer.clientThreadStartCpuAffinity = -1 + +# Meta server process max. locked memory. +# If set to a value greater than 0 then locked memory limit will be set to the +# specified value, and mlock(MCL_CURRENT|MCL_FUTURE) invoked. +# On linux running under non root user setting locked memory "hard" limit +# greater or equal to the specified value required. ulimit -l can be used for +# example. +# Default is 0 -- no memory locking. +# metaServer.maxLockedMemory = 0 + +# Size of [network] io buffer pool. +# The default buffer size is 4K, therefore the amount of memory is +# 4K * metaServer.bufferPool.partionBuffers. +# All io buffers are allocated at startup. +# If memory locking enabled io buffers are locked in memory at startup. +# Default is 256K or 1GB on 64 bit system, and 32K or 128MB on 32 bit system. +# metaServer.bufferPool.partionBuffers = 262144 + +# ============================================================================== +# The parameters below this line can be changed at runtime by editing the +# configuration file and sending meta server process HUP signal. +# Note that to restore parameter to default at run time the default value must +# be explicitly specified in the configuration file. In other words commenting +# out the parameter will not have any effect until restart. + +# WORM mode. +# Write once, read many mode. +# In this mode only modification of files ".tmp" (without quotes) suffix is +# allowed. +# Typically the application would create and write the file with ".tmp" suffix, +# and then rename it so the destination file name will not have ".tmp " suffix. +# To delete a file without ".tmp" suffix the mode can be temporary turned off +# by the administrator. "qfstoggleworm" utility, or temporary configuration +# modification can be used to do that. +# Default is 0. +# metaServer.wormMode = 0 + +# Mininum number of connected / functional chunk servers before the file system +# can be used. +# Default is 1. 
+# metaServer.minChunkservers = 1
+
+# Wait 30 sec for chunk servers to connect back after restarting, before the
+# file system is considered fully functional.
+metaServer.recoveryInterval = 30
+
+# Ignore master/slave chunk server assignment for write append.
+# Master/slave assignment can help with append replication 2, to avoid a
+# theoretically possible IO buffer resource deadlock when chunk server A is a
+# master in one "AB" synchronous append replication chain, and chunk server B
+# is a master in another "BA" synchronous replication chain.
+# In practice such deadlocks should be rare enough not to matter, and, if they
+# occur, are resolved with the replication timeout mechanism.
+# The downside of using master/slave assignment is that presently it only works
+# with replication 3, and only half of the chunk server population will be
+# accepting clients' append requests.
+# Default is "on" -- ignore.
+# metaServer.appendPlacementIgnoreMasterSlave = 1
+
+# For write append use the low order bit of the IP address for the chunk
+# servers' master/slave assignment. This scheme works well if the least
+# significant bit of the ip address uniformly distributes masters and slaves
+# within the rack, especially with "in rack" placement for append.
+# Default is 0. Assign master / slave to keep the number of masters and slaves
+# equal. The obvious downside of this is that the assignment depends on the
+# chunk servers' connection order.
+# metaServer.assignMasterByIp = 0
+
+# Chunk server executable md5 sum white list.
+# The chunk server sends its executable md5sum when it connects to the meta
+# server. If the following space separated list is not empty and does not
+# contain the chunk server executable md5 sum then the chunk server
+# is instructed to exit or restart itself.
+# This might be useful for upgrades or version control.
+# While the chunk server is connected to the meta server no md5 sum
+# verification is performed.
+# Default is empty list.
+# metaServer.chunkServerMd5sums =
+
+# Unique file system id -- some name that uniquely identifies the distributed
+# file system instance.
+# This is used to protect against data loss and / or corruption in the case
+# where chunk server(s) connect to the "wrong" meta server.
+# The meta server will not accept connections from the chunk servers with a
+# different "cluster key".
+# Default is empty string.
+metaServer.clusterKey = <%= scope['qfs::common::cluster_key'] %>
+
+# Assign rack id by ip prefix -- ip address treated as string.
+# The prefix can be positioned with trailing ??
+# For example: 10.6.34.2?
+# The rack id is assigned on chunk server connect, and will not change until
+# the chunk server re-connects. Therefore the configuration file changes will
+# not have any effect until the chunk servers re-connect.
+# Default is empty -- use rack id assigned in the chunk server config.
+# metaServer.rackPrefixes =
+# Example:
+# 10.6.1.* -- rack 1, 10.6.2.* -- rack 2, 10.6.4.1? -- rack 4 etc.
+# metaServer.rackPrefixes = 10.6.1. 1 10.6.2. 2 10.6.4.1? 4 10.6.4.1 5
+
+# "Static" placement weights of the racks. The more weight and the more chunk
+# servers are in the rack, the more likely the rack will be chosen for chunk
+# allocation.
+# Default is empty -- all weights default to 1.
+# metaServer.rackWeights =
+# Example: Racks 1 and 2 have weight 1, rack 3 -- 0.9, rack 4 weight 1.2,
+# rack 5 weight 1.5. All other rack weights are 1.
+# metaServer.rackWeights = 1 1 2 1 3 0.9 4 1.2 5 1.5
+
+# Various timeout settings.
+
+# Extend write lease expiration time by 30 sec.
in the case of the write master +# disconnect, to give it a chance to re-connect. +# Default is 30 sec. Production value is 60 sec. +# metaServer.leaseOwnerDownExpireDelay = 30 + +# Re-replication or recovery delay in seconds on chunk server down, to give +# chunk server a chance to re-connect. +# Default is 120 sec. +# metaServer.serverDownReplicationDelay = 120 + +# Chunk server heartbeat interval. +# Default is 30 sec. +# metaServer.chunkServer.heartbeatInterval = 30 + +# Chunk server operations timeouts. +# Heartbeat timeout results in declaring chunk server non operational, and +# closing connection. +# All other operations timeout are interpreted as the operation failure. +# The values are in seconds. +# The defaults: +# metaServer.chunkServer.heartbeatTimeout = 60 +# metaServer.chunkServer.chunkReallocTimeout = 75 +# metaServer.chunkServer.chunkAllocTimeout = 40 +# metaServer.chunkServer.chunkReallocTimeout = 75 +# metaServer.chunkServer.makeStableTimeout = 330 +# metaServer.chunkServer.replicationTimeout = 330 + +# The current production values. +# metaServer.chunkServer.heartbeatInterval = 18 +# metaServer.chunkServer.heartbeatTimeout = 30 +# metaServer.chunkServer.chunkReallocTimeout = 18 +# metaServer.chunkServer.chunkAllocTimeout = 18 +# metaServer.chunkServer.makeStableTimeout = 60 + +# Other chunk server operations timeout. +# metaServer.chunkServer.requestTimeout = 600 + +# Chunk server space utilization placement threshold. +# Chunk servers with space utilization over this threshold are not considered +# as candidates for the chunk placement. +# Default is 0.95 or 95%. +# metaServer.maxSpaceUtilizationThreshold = 0.95 + +# Unix style permissions +# Space separated list of ip addresses of hosts where root user is allowed. +# Empty list means that root user is allowed on any host. +# Default is empty. +# metaServer.rootHosts = + +# File modification time update resolution. Increasing the value will reduce the +# transaction log writes with large files. +# Default is 1 sec. +# metaServer.MTimeUpdateResolution = 1 + +# --------------- File create limits. ------------------------------------------ +# +# Disallow specific file types. The list is space separate file type ids. +# Default is empty list. All valid file types are allowed. +# metaServer.createFileTypeExclude = + +# Limit number of data stripes for all file types.If create attempt exceeds +# the limit the meta server returns "permission denied". +# Default is the max supported by the compile time constants. +# metaServer.maxDataStripeCount = 511 + +# Limit number of recovery stripes for all file types. If create attempt exceeds +# the limit the meta server returns "permission denied". +# Default is 32. +# Max supported by the compile time constants in common/kfstypes.h is 127. +# metaServer.maxRecoveryStripeCount = 32 + +# Limit number of data stripes for files with recovery. +# Default is 64. +# Max supported by the compile time constants in common/kfstypes.h is 511. +# metaServer.maxRSDataStripeCount = 64 + +# Max number of replicas for "regular / replicated" file with no recovery. +# If create, or change replication requests exceeds this limit then the meta +# server replaces the value with the value specified. +# metaServer.maxReplicasPerFile = 64 + +# Max number of replicas for RS (file with recovery). +# If create, or change replication requests exceeds this limit then the meta +# server replaces the value with the value specified. +# metaServer.maxReplicasPerRSFile = 64 + +# Force effective user to root. 
This effectively turns off all permissions +# control. +# Default is off. +# metaServer.forceEUserToRoot = 0 + +# Client backward compatibility. +# Defaults are no user and no group -- no backward compatibility. +# metaServer.defaultUser = 0xFFFFFFFF +# metaServer.defaultGroup = 0xFFFFFFFF +# metaServer.defaultFileMode = 0644 +# metaServer.defaultDirMode = 0755 + +# The chunk server disconnects history size. Useful for monitoring. +# Default is 4096 slots / disconnect events. +# metaServer.maxDownServersHistorySize = 4096 + +# Space and placement re-balancing. +# Space re-balancing is controlled by the next two parameters (thresholds) below. +# Re-balancing constantly scans all chunks in the system and checks chunk +# placement within the replication or RS groups, and moves chunks from chunk +# servers that are above metaServer.maxRebalanceSpaceUtilThreshold to the chunk +# servers that are below metaServer.minRebalanceSpaceUtilThreshold. +# Default is 1 -- on. +# metaServer.rebalancingEnabled = 1 + +# Space re-balancing thresholds. +# Move chunk from the servers that exceed the +# metaServer.maxRebalanceSpaceUtilThreshold +# Default is 0.82 +# metaServer.maxRebalanceSpaceUtilThreshold = 0.82 + +# Move chunks to server below metaServer.minRebalanceSpaceUtilThreshold. +# Default is 0.72. +# metaServer.minRebalanceSpaceUtilThreshold = 0.72 + +# Time interval in seconds between replication queues scans. +# The more often the scan is scheduled the more cpu can potentially use. +# Default is 5 sec. +# metaServer.replicationCheckInterval = 5 + +# Re-balance scan depth. +# Max number of chunks to scan in one partial scan. The more chunks are scanned +# the more cpu re-balance will use, and the "faster" it will scan the chunks. +# metaServer.maxRebalanceScan = 1024 + +# Single re-balance partial scan time limit. +# Default is 0.03 sec. +# metaServer.maxRebalanceRunTime = 0.03 + +# Minimum time between two consecutive re-balance partial scans. +# Default is 0.512 sec. +# metaServer.rebalanceRunInterval = 0.512 + +# Max. number of a single client connection requests in flight. +# The higher value might reduce cpu and alleviate "head of the line blocking" +# when single client connection shared between multiple concurrent file readers +# and writers, potentially at the cost of reducing "fairness" between the client +# connections. Increasing the value could also reduce number of context +# switches, and os scheduling overhead with the "client" threads enabled. +# Default is 16 if the "client" threads are enabled, and 1 otherwise. +# metaServer.clientSM.maxPendingOps = 16 + +# ------------------ Chunk placement parameters -------------------------------- + +# The metaServer.sortCandidatesByLoadAvg and +# metaServer.sortCandidatesBySpaceUtilization are mutially exclusive. +# metaServer.sortCandidatesBySpaceUtilization takes precedence over +# metaServer.sortCandidatesByLoadAvg if both set to 1 + +# When allocating (placing) a chunk prefer chunk servers with lower "load" +# metric over the chunk servers with the higher "load" metric. +# For the write intensive file systems turning this mode on is +# recommended. +# Default is 0. Do not take chunk server "load" metric into the account. +# metaServer.sortCandidatesByLoadAvg = 0 + +# When allocating (placing) a chunk prefer chunk servers with lower disk space +# utilizaiton. +# Default is 0. Do not take space utilization into the account. 
+# metaServer.sortCandidatesBySpaceUtilization = 0 + +# When allocating (placing) a chunk do not consider chunk server with the "load" +# exceeding average load multiplied by metaServer.maxGoodCandidateLoadRatio. +# Default is 4. +# metaServer.maxGoodCandidateLoadRatio = 4 + +# When allocating (placing) a chunk do not consider chunk server with the "load" +# exceeding average "master" chunk server load multiplied by +# metaServer.maxGoodMasterLoadRatio if the chunk server is used as master (head +# or synchronous replication chain). +# Default is 4. +# metaServer.maxGoodMasterLoadRatio = 4 + +# When allocating (placing) a chunk do not consider chunk server with the "load" +# exceeding average "slave" load multiplied by metaServer.maxGoodSlaveLoadRatio +# if the chunk server is used as slave. +# Default is 4. +# metaServer.maxGoodSlaveLoadRatio = 4 + +# When allocating (placing) a chunk do not consider chunk server with the +# average number of chunks opened for write per drive (disk) exceeding average +# number of chunks opened for write across all disks / chunks servers multiplied +# by metaServer.maxWritesPerDriveRatio. +# Default is 1.5. +# metaServer.maxWritesPerDriveRatio = 1.5 + +# When allocating (placing) a chunk do not consider chunk server running on the +# same host as writer if the average number of chunks opened for write per drive +# (disk) exceeding average number of chunks opened for write across all disks / +# chunks servers multiplied by metaServer.maxLocalPlacementWeight. +# Default is 1.0. +# metaServer.maxLocalPlacementWeight = 1.0 + +# "In rack" placement for append and non append chunk allocations. +# Place chunk replicas on the same rack to save cross rack bandwidth at the cost +# of reduced reliability. Useful for temporary / scratch file systems. +# Default is 0. +# metaServer.inRackPlacementForAppend = 0 + +# "In rack" placement for non append files. +# Default is 0 - place replicas and chunks from the same RS blocks on different +# racks. +# metaServer.inRackPlacement = 0 + +# Limit number of re-replications (this does not include RS chunk recovery), +# that the given chunk server can be used as replication "source". +# Default is 10. +# metaServer.maxConcurrentReadReplicationsPerNode = 10 + +# Limit max concurrent chunk re-replications and RS recoveries per chunk server. +# Default is 5. +# metaServer.maxConcurrentWriteReplicationsPerNode = 5 + +#------------------------------------------------------------------------------- + +# Order chunk replicas locations by the chunk "load average" metric in "get +# alloc" responses. The read client logic attempts to use replicas in this +# order. +# Default is 0. The replicas locations are shuffled randomly. +# metaServer.getAllocOrderServersByLoad = 0 + +# Delay recovery for the chunks that are past the logical end of file in files +# with Reed-Solomon redundant encoding. +# The delay is required to avoid starting recovery while the file is being +# written into, and the chunk sizes aren't known / final. The writer can stop +# writing into a file, and the corresponding chunks write leases might timed +# out, and will be automatically revoked. The existing writer logic sets logical +# EOF when it closes the file, before that the logical file size remains 0 +# during write. (Unless it is re-write which is currently for all practical +# purposes not supported with RS files). The timeout below should be set to +# at least the max. practical file "write" time. 
+# Setting the timeout to a very large value will prevent processing the chunks +# sitting in the replication delayed queue from the "abandoned" files, i.e. +# files that the writer wrote something and then exited without closing the +# file. +# The parameter and the corresponding "delay" logic will likely be removed in +# future releases, and replaced with the write lease renew logic. +# Default is 6 hours or 21600 seconds. +# metaServer.pastEofRecoveryDelay = 21600 + +# Periodic checkpointing. +# If set to -1 checkpoint is disabled. In such case "logcompactor" can be used +# periodically create new checkpoint from the transaction logs. +# Default is 3600 sec. +# metaServer.checkpoint.interval = 3600 + +# Checkpoint lock file name. Can be used to serialize checkpoint write and load +# with external programs, for example logcompactor. +# Default is empty -- no lock file used. +# metaServer.checkpoint.lockFileName = + +# Max consecutive checkpoint write failures. +# Meta server will exit if checkpoint write fails +# metaServer.checkpoint.maxFailedCount times in the row for any reason (not +# enough disk space for example). +# Default is 2. +# metaServer.checkpoint.maxFailedCount = 2 + +# Checkpoint write timeout. Max time the checkpoint write can take before +# declaring write failure. +# Default is 3600 sec. +# metaServer.checkpoint.writeTimeoutSec = 3600 + +# Use synchronous mode to write checkpoint, i.e. tell host os to flush all data +# to disk prior to write system call return. +# The main purpose is to reduce the number of "dirty" / unwritten pages in the +# host os vm subsystem / file system buffer cache, therefore reducing memory +# contention and lowering the chances of paging out meta server and other +# processes with no memory locking. +# Default is on. +# metaServer.checkpoint.writeSync = 1 + +# Checkpoint write buffer size. +# The buffer size should be adequate with synchrounous write mode enabled, +# especially if journal and data of host's file system are on the same spinning +# media device, in order to minimize the number of seeks. +# Default is 16MB. +# metaServer.checkpoint.writeBufferSize = 16777216 + +# ---------------------------------- Audit log. -------------------------------- + +# All request headers and response status are logged. +# The audit log records are null ('\0') separated. +# The log could be useful for debugging and audit purposes. +# The logging require some cpu, but the main resource consumption is disk io. +# Default is off. +# metaServer.clientSM.auditLogging = 0 + +# Colon (:) separated file name prefixes to store log segments. +# Default is empty list. +# metaServer.auditLogWriter.logFilePrefixes = + +# Maximum log segment size. +# Default is -1 -- unlimited. +# metaServer.auditLogWriter.maxLogFileSize = -1 + +# Maximum number of log segments. +# Default is -1 -- unlimited. +# metaServer.auditLogWriter.maxLogFiles = -1 + +# Max. time to wait for the log buffer to become available. +# When wait is enabled the request processing thread will wait for the log +# buffer disk io to complete. If the disk subsystem cannot keep up with the +# logging it will slow down the meta server request processing. +# Default is -1. Do not wait, drop log record instead. +# metaServer.auditLogWriter.waitMicroSec = -1 + +#------------------------------------------------------------------------------- + +# ---------------------------------- Message log. 
------------------------------
+
+# Message log level FATAL, ALERT, CRIT, ERROR, WARN, NOTICE, INFO, DEBUG
+# Default is DEBUG, except for non-debug builds with NDEBUG defined, where INFO
+# is the default.
+metaServer.msgLogWriter.logLevel = INFO
+
+# Colon (:) separated file name prefixes to store log segments.
+# Default is empty list. The default is to use the file name from the command
+# line, or if none is specified, write into file descriptor 2 -- stderr.
+# metaServer.msgLogWriter.logFilePrefixes =
+
+# Maximum log segment size.
+# Default is -1 -- unlimited.
+# metaServer.msgLogWriter.maxLogFileSize = -1
+
+# Maximum number of log segments.
+# Default is -1 -- unlimited.
+# metaServer.msgLogWriter.maxLogFiles = -1
+
+# Max. time to wait for the log buffer to become available.
+# When wait is enabled the request processing thread will wait for the log
+# buffer disk io to complete. If the disk subsystem cannot keep up with the
+# logging it will slow down the meta server request processing.
+# Default is -1. Do not wait, drop log record instead.
+# metaServer.msgLogWriter.waitMicroSec = -1
+
+#-------------------------------------------------------------------------------
+
+# -------------------- Chunk servers authentication. ---------------------------
+#
+# Authentication is off by default. Both X509 (ssl) and Kerberos authentication
+# methods can be enabled at the same time. The chunk server can negotiate the
+# authentication method. If both Kerberos and X509 are configured on the chunk
+# server and meta server then Kerberos authentication is used.
+# Chunk and meta servers perform mutual authentication with authentication
+# enabled.
+#
+# Use of X509 authentication is recommended in order to avoid KDC dependency.
+# Chunk servers have to periodically request Kerberos tickets from KDC. The
+# meta server enforces Kerberos ticket expiration time, by asking the chunk
+# server to re-authenticate when its ticket expires. Therefore KDC
+# unavailability for any reason, including network communication outage, might
+# result in chunk server disconnects. Long enough KDC unavailability might
+# result in unrecoverable data loss, due to the file system's inability to
+# perform replication and recovery in response to disk and node failures.
+#
+# Please see the OpenSSL documentation for a detailed description of X509
+# authentication configuration.
+# src/test-scripts/qfsmkcerts.sh might be used as a simple example of how to
+# create and use a certificate authority and X509 certificates.
+
+# Maximum authenticated session lifetime. This limits authenticated session time
+# for all authentication methods. In other words, the session [connection] must
+# be re-authenticated if the authentication token (Kerberos ticket, or x509
+# certificate) "end time" is reached or the authenticated session exists longer
+# than the value of this parameter.
+# Default is 24 hours.
+# metaServer.clientAuthentication.maxAuthenticationValidTimeSec = 86400
+
+# Check chunk server authenticated name against the user and group database.
+# If enabled then the authenticated name must be present in the user database
+# in order for the chunk server to be accepted.
+# Default is 0 (off), use only black and white lists, if configured, see below.
+# metaServer.CSAuthentication.useUserAndGrupDb = 0
+
+# ================= X509 authentication ========================================
+# Meta server's X509 certificate file in PEM format.
+# metaServer.CSAuthentication.X509.X509PemFile =
+
+# Password if X509 PEM file is encrypted.
+# metaServer.CSAuthentication.X509.X509Password = + +# Meta server's private key file. +# metaServer.CSAuthentication.X509.PKeyPemFile = + +# Password if private key PEM file is encrypted. +# metaServer.CSAuthentication.X509.PKeyPassword = + +# Certificate authorities file. Used for both chunk server certificate +# validation and to create certificate chain with meta server's X509 +# certificate. +# metaServer.CSAuthentication.X509.CAFile = + +# Certificate authorities directory can be used in addition to CAFile. +# For more detailed information please see SSL_CTX_load_verify_locations manual +# page. CAFile/CADir corresponds to CAfile/CApath in the man page. +# metaServer.CSAuthentication.X509.CADir = + +# If set (the default) verify peer certificate, and declare error if peer, i.e. +# chunk server, does not preset "trusted" valid X509 certificate. +# Default is on. +# metaServer.CSAuthentication.X509.verifyPeer = 1 + +# OpenSSL cipher configuration. +# metaServer.CSAuthentication.X509.cipher = !ADH:!AECDH:!MD5:HIGH:@STRENGTH + +# SSL/TLS session cache timeout. Session cache is only used with X509 +# authentication method, with non default client or server side openssl options +# that turns off use of tls session tickets. +# Default is 4 hours. +# metaServer.CSAuthentication.X509.session.timeout = 14400 + +# The long integer value passed to SSL_CTX_set_options() call. +# See open ssl documentation for details. +# Default is the integer value that corresponds to SSL_OP_NO_COMPRESSION +# metaServer.clientAuthentication.X509.options = + +# ================= Kerberos authentication ===================================== +# Kerberos principal: service/host@realm + +# Meta server's Kerberos principal [service/host@realm] service name part. +# metaServer.CSAuthentication.krb5.service = + +# Meta server's Kerberos principal [service/host@realm] host name part. +# metaServer.CSAuthentication.krb5.host = + +# Kerberos keytab file with the key(s) that corresponds to the meta server's +# principal. +# metaServer.CSAuthentication.krb5.keytab = + +# Copy keytab into memory keytab, if supported by the kerberos versions, to +# improve performance, and avoid disk access. +# Default is on. +# metaServer.CSAuthentication.krb5.copyToMemKeytab = 1 + +# Client's (chunk server) principal "unparse" mode. +# Can be set to space separated combination of the following modes: +# short noRealm display +# The result of the principal conversion to string is used as client's +# (chunk server's) "authenticated name". +# The default is fullly qualified principal name. For chunk servers it +# would typically be in the form of service/host@realm. +# The "unparsed" chunk server name is checked against "black" and "white" chunk +# server list names as described below. +# metaServer.CSAuthentication.krb5.princUnparseMode = + +# OpenSSL cipher configuration for TLS-PSK authentication method. This method +# is used with TLS-PSK and with Kerberos authentication. +# metaServer.CSAuthentication.psk.cipherpsk = !ADH:!AECDH:!MD5:!3DES:PSK:@STRENGTH + +# The long integer value passed to SSL_CTX_set_options() call. +# See open ssl documentation for details. 
+# Default is the integer value that corresponds to the logical OR of +# SSL_OP_NO_COMPRESSION and SSL_OP_NO_TICKET +# metaServer.CSAuthentication.psk.options = + +# ================= PSK authentication ========================================= +# PSK chunk server authentication is intended only for testing and possibly for +# small [test] clusters with very few chunk servers, where the same +# authentication credentials [PSK "key"] are used for for all chunk servers. + +# Chunk server PSK key id. This string sent to the chunk as TLS PSK "hint", and +# also used as chunk server "authenticated name". +# This effectively overrides chunk server key id. +# If chunk server key id set to non empty string, then it can be left empty. +# In such case chunk server key id is used as authenticated name. The chunk +# server key id sent as "clear text" as part of ssl handshake, and is not +# "tied" in any way known to the meta server logic to the key id, therefore any +# "name" can be used. In other words the key is the only real security +# "credential" with this authentication scheme. +# The resulting chunk server name must not be empty, and pass "black" and +# "white" list check, see below. +# metaServer.CSAuthentication.psk.keyId = + +# Chunk server PSK key (the "pre-shared-key"). The same key must be used on the +# chunk server side in order for psk authentication to work. +# The default is empty key -- PSK authentication is not enabled. +# The key must be base 64 encoded, i.e. it must be valid base 64 sequence. +# metaServer.CSAuthentication.psk.key = + +# ================= Chunk servers's "black" and "white" lists ================== + +# Chunk server's X509 common names and/or kerberos names, "black" ("revocation") +# list. If chunk server's authenticated name matches one of the name in this +# list the authentication will fail. The names in the list are must be +# separated by spaces. Names with white space symbols are not supported. +# metaServer.CSAuthentication.blackList = + +# Chunk server's X509 common names and/or kerberos names, "white list". Unless +# the list is empty the chunk server's authenticated name must match one of the +# names in the list. +# metaServer.CSAuthentication.whiteList = + +#------------------------------------------------------------------------------- + +# -------------------- User / "client" authentication. ------------------------- +# Client X509 and kerberos authentication parameters only differ from chunk +# server's authentication parameters by metaServer.clientAuthentication prefix. +# The defaults are identical to chunk server authentication. + +# Maximum authenticated session lifetime. This limits authenticated session time +# for all authentication methods. In other words, the session [connection] must +# be re-authenticated if the authentication token (delegation token, Kerberos +# ticket, or x509 certificate) "end time" is reached or authenticated session +# exists longer than the value of this parameter. +# Default is 24 hours. +# metaServer.clientAuthentication.maxAuthenticationValidTimeSec = 86400 + +# ================= X509 authentication ======================================== +# Meta server's X509 certificate file in PEM format. +# metaServer.clientAuthentication.X509.X509PemFile = + +# Password if X509 PEM file is encrypted. +# metaServer.clientAuthentication.X509.X509Password = + +# Meta server's private key file. +# metaServer.clientAuthentication.X509.PKeyPemFile = + +# Password if private key PEM file is encrypted. 
+# metaServer.clientAuthentication.X509.PKeyPassword =
+
+# Certificate authorities file. Used for both client certificate validation
+# and to create the certificate chain with the meta server's X509
+# certificate.
+# metaServer.clientAuthentication.X509.CAFile =
+
+# Certificate authorities directory can be used in addition to CAFile.
+# For more detailed information please see SSL_CTX_load_verify_locations manual
+# page. CAFile/CADir corresponds to CAfile/CApath in the manual page.
+# metaServer.clientAuthentication.X509.CADir =
+
+# If set (the default) verify peer certificate, and declare error if peer, i.e.
+# QFS client, does not present a certificate.
+# Default is on.
+# metaServer.clientAuthentication.X509.verifyPeer = 1
+
+# OpenSSL cipher configuration for X509 authentication method.
+# metaServer.clientAuthentication.X509.cipher = !ADH:!AECDH:!MD5:HIGH:@STRENGTH
+
+# SSL/TLS session cache timeout. Session cache is only used with the X509
+# authentication method, with non default client or server side openssl options
+# that turn off the use of tls session tickets.
+# Default is 4 hours.
+# metaServer.clientAuthentication.X509.session.timeout = 14400
+
+# The long integer value passed to SSL_CTX_set_options() call.
+# See open ssl documentation for details.
+# Default is the integer value that corresponds to SSL_OP_NO_COMPRESSION
+# metaServer.clientAuthentication.X509.options =
+
+# ================= Kerberos authentication =====================================
+# Kerberos principal: service/host@realm
+
+# Meta server's Kerberos principal [service/host@realm] service name part.
+# metaServer.clientAuthentication.krb5.service =
+
+# Meta server's Kerberos principal [service/host@realm] host name part.
+# metaServer.clientAuthentication.krb5.host =
+
+# Kerberos keytab file with the key(s) that corresponds to the meta server's
+# principal.
+# metaServer.clientAuthentication.krb5.keytab =
+
+# Copy keytab into memory keytab, if supported by the kerberos version, to
+# improve performance, and avoid disk access.
+# Default is on.
+# metaServer.clientAuthentication.krb5.copyToMemKeytab = 1
+
+# Client's principal "unparse" mode.
+# Can be set to a space separated combination of the following modes:
+# short noRealm display
+# The result of the principal conversion to string is used as the client's
+# "authenticated name".
+# The default is the fully qualified principal name. For users this would
+# typically be in the form user@realm.
+# The resulting authenticated name should match the password database the meta
+# server host uses. The recommended value is "short": discard the realm if it
+# matches the kerberos configuration's default realm.
+# metaServer.clientAuthentication.krb5.princUnparseMode =
+
+# OpenSSL cipher configuration for TLS-PSK authentication method. This method
+# is used with delegation and with Kerberos authentication.
+# metaServer.clientAuthentication.psk.cipherpsk = !ADH:!AECDH:!MD5:!3DES:PSK:@STRENGTH
+
+# The long integer value passed to SSL_CTX_set_options() call.
+# See open ssl documentation for details.
+# Default is the integer value that corresponds to the logical OR of
+# SSL_OP_NO_COMPRESSION and SSL_OP_NO_TICKET
+# metaServer.clientAuthentication.psk.options =
+
+# The following two parameters and respective defaults are intended to allow
+# non authenticated access for the meta server web UI from the local host.
+
+# Space separated list of host ips from which the RPCs listed in the next
+# parameter are permitted with no authentication.
+
+# OpenSSL cipher configuration for the TLS-PSK authentication method. This
+# method is used with delegation and with Kerberos authentication.
+# metaServer.clientAuthentication.psk.cipherpsk = !ADH:!AECDH:!MD5:!3DES:PSK:@STRENGTH
+
+# The long integer value passed to the SSL_CTX_set_options() call.
+# See the openssl documentation for details.
+# Default is the integer value that corresponds to the logical OR of
+# SSL_OP_NO_COMPRESSION and SSL_OP_NO_TICKET
+# metaServer.clientAuthentication.psk.options =
+
+# The following two parameters and their respective defaults are intended to
+# allow non authenticated access to the meta server web UI from the local host.
+
+# Space separated list of host ips from which the RPCs listed in the next
+# parameter are permitted with no authentication.
+# Default is 127.0.0.1
+# metaServer.clientAuthentication.noAuthOpsHostIps = 127.0.0.1
+
+# Space separated list of RPC names that are allowed with no authentication, if
+# the client's host ip obtained with the getpeername() call matches one of the
+# ips in the preceding list.
+# Default is the RPCs used by the meta server web UI.
+# metaServer.clientAuthentication.noAuthOps = PING GET_CHUNK_SERVERS_COUNTERS GET_CHUNK_SERVER_DIRS_COUNTERS GET_REQUEST_COUNTERS DISCONNECT
+
+# ================= Client's "black" and "white" lists =========================
+
+# Client's (user) X509 common names and/or kerberos names, "black"
+# ("revocation") list. If the client's authenticated name matches one of the
+# names in this list the authentication will fail. The names in the list must
+# be separated by spaces. Names with white space symbols are not supported.
+# metaServer.clientAuthentication.blackList =
+
+# Client's X509 common names and/or kerberos names, "white" list. Unless
+# the list is empty the client's authenticated name must match one of the
+# names in the list.
+# metaServer.clientAuthentication.whiteList =
+
+# ================== Delegation ================================================
+#
+# Delegation token expiration time limit.
+# Default is 24 hours.
+# metaServer.clientAuthentication.maxDelegationValidForTimeSec = 86400
+
+# Do not limit the delegation token end time to the meta server session
+# credentials (Kerberos ticket or X509 certificate) end time.
+# Default is 0.
+# metaServer.clientAuthentication.delegationIgnoreCredEndTime = 0
+
+#===============================================================================
+
+# Allow the use of "clear text" communication mode by performing SSL/TLS
+# shutdown immediately after successful authentication completion. If enabled,
+# the QFS client's corresponding setting defines the communication mode between
+# the client and the write master. The "clear text" communication mode between
+# chunk servers (synchronous replication, re-replication, and chunk recovery)
+# will be used if this parameter is set to "on".
+# Using this mode might make sense in order to reduce chunk server CPU
+# utilization and/or possibly increase IO throughput, in the cases where the
+# chunk server communication channel is considered to have adequate security
+# for the purpose at hand.
+# Default is "off" / "no".
+# metaServer.clientCSAllowClearText = 0
+
+# Chunk server access token maximum lifetime.
+# The chunk server access token time defines the chunk access time limit. Chunk
+# access tokens have a 10 min time limit -- twice the chunk lease time. The
+# chunk server access token effectively defines the maximum client and chunk
+# server to chunk server connection lifetimes. The client and chunk servers
+# attempt to obtain and use a new chunk server access token before the current
+# token expires, and re-open the connection with the newly obtained token.
+# Default is 2 hours.
+# metaServer.CSAccessValidForTimeSec = 7200
+
+# The meta server limits the write lease end time to the max of the current
+# time plus the value of the following parameter, and the authentication end
+# time. The parameter is intended primarily for testing, to avoid spurious
+# write retries with the authentication maximum lifetime set to a very small
+# value -- 5 sec.
+# (The short authentication lifetime is used in order to test the
+# re-authentication logic.)
+# metaServer.minWriteLeaseTimeSec = 600
+
+#-------------------------------------------------------------------------------
+
+# -------------------- User and group configuration. ---------------------------
+# User and group database parameters.
+# The meta server host's user and group configuration is used for the QFS file
+# system when QFS authentication is enabled. The user and group database is
+# used to map "authenticated names" obtained with the Kerberos and X509
+# authentication methods to user and group ids, and to establish group
+# membership. Authenticated names that have no corresponding user id, or user
+# ids that have no corresponding "user name", are considered invalid, and as a
+# result the authentication fails.
+# User and group ids with the value 4294967295 have special treatment: access
+# is always denied for users with such an id.
+# A root user entry with the name "root" and id 0 is added if not present in
+# the user database, unless explicitly excluded with the
+# metaServer.userAndGroup.excludeUser parameter.
+# With authentication enabled the QFS client library does not use the host's
+# local user and group database; the meta server host's database is effectively
+# used by all QFS clients.
+
+# Minimum user id to include in the user name to id mapping.
+# Default is 0.
+# metaServer.userAndGroup.minUserId = 0
+
+# Maximum user id to include in the user name to id mapping.
+# Default is 4294967295.
+# metaServer.userAndGroup.maxUserId = 4294967295
+
+# Minimum group id to include in the group name to group id mapping.
+# Default is 0.
+# metaServer.userAndGroup.minGroupId = 0
+
+# Maximum group id to include in the group name to group id mapping.
+# Default is 4294967295.
+# metaServer.userAndGroup.maxGroupId = 4294967295
+
+# Omit entries whose user names have one of the specified prefixes.
+# metaServer.userAndGroup.omitUserPrefix =
+
+# Omit entries whose group names have one of the specified prefixes.
+# Default is empty list.
+# metaServer.userAndGroup.omitGroupPrefix =
+
+# Update / re-read the user and group to id mappings every N seconds.
+# By default periodic updates are effectively disabled. Parameter reload with
+# the HUP signal can be used to trigger a user and group information update.
+# Default is 315360000.
+# metaServer.userAndGroup.updatePeriodSec = 315360000
+
+# Disable user and group initial loading and/or reloading.
+# Default is enabled.
+# metaServer.userAndGroup.disable = 0
+
+# Space separated list of the user names to exclude when loading or updating
+# the user database.
+# Default is empty list.
+# metaServer.userAndGroup.excludeUser =
+
+# Space separated list of the group names to exclude when loading or updating
+# the group database.
+# Default is empty list.
+# metaServer.userAndGroup.excludeGroup =
+
+# Space separated list of group names, where members of these groups
+# have effective user id 0 -- root.
+# Default is empty list.
+# metaServer.userAndGroup.rootGroups =
+
+# Space separated list of user names, where such users have effective user
+# id 0 -- root.
+# The user with name root and id 0 is always added, even if it isn't present in
+# or is excluded from the user database.
+# Default is empty list.
+# metaServer.userAndGroup.rootUsers =
+
+# Space separated list of user names. The specified users are allowed to
+# perform meta server administrative requests: fsck, chunk server retire,
+# toggle worm, recompute directory sizes, dump chunk to servers map,
+# dump replication candidates, check chunk leases, list open files.
+# Default is root user.
+# metaServer.userAndGroup.metaServerAdminUsers = root
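+
+# For illustration only (not part of the stock template): since the meta server
+# host's user database is what QFS uses when authentication is on, the mapping
+# for an authenticated name, e.g. "alice" (a placeholder), can be checked on
+# that host with standard tools:
+#   getent passwd alice   # the user name to id mapping the meta server will use
+#   id -Gn alice          # group membership as the meta server will see it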
+
+# Space separated list of group names. Members of these groups are allowed to
+# perform the meta server administration described in the previous parameter's
+# section.
+# Default is empty list.
+# metaServer.userAndGroup.metaServerAdminGroups =
+
+# Space separated list of user names. The specified users are allowed to
+# perform meta server status inquiry requests: ping, up servers, meta stats,
+# get chunk servers counters, get chunk directory counters, get meta server
+# request counters.
+# Default is root user.
+# metaServer.userAndGroup.metaServerStatsUsers = root
+
+# Space separated list of group names. Members of these groups are allowed to
+# perform the meta server status requests described in the previous parameter's
+# section.
+# Default is empty list.
+# metaServer.userAndGroup.metaServerStatsGroups =
+
+# Space separated list of user names. The specified users are allowed to
+# renew and cancel delegation tokens that belong to other users. Delegation
+# cannot be used to perform delegation renew or cancel; i.e. the Kerberos or
+# X509 authentication methods must be used.
+# Default is empty list.
+# metaServer.userAndGroup.delegationRenewAndCancelUsers =
+
+# Space separated list of group names. Members of these groups are allowed to
+# renew and cancel delegation tokens that belong to other users, as described
+# in the previous parameter's section.
+# metaServer.userAndGroup.delegationRenewAndCancelGruops =
+
+# -------------------- Meta server cryptographic keys. -------------------------
+
+# Key lifetime. The value defines the maximum time before the delegation tokens
+# issued by the meta server expire, and have to be renewed.
+# Default is 4 hours.
+# metaServer.cryptoKeys.keyValidTimeSec = 14400
+
+# Key change period.
+# Key lifetime minus key change period is the minimum time before a delegation
+# token must be renewed.
+# metaServer.cryptoKeys.keyChangePeriodSec = 7200
+
+# Meta server crypto keys file name.
+# Specify a file name to save the meta server keys, in order to ensure that
+# delegation tokens are persistent across meta server restarts.
+# Default is none; keys are not persistent across meta server restarts.
+# metaServer.cryptoKeys.keysFileName =
+
+# ------------------- Meta server authentication override ----------------------
+#
+# Setting the following 4 parameters to the values specified below will
+# effectively disable client authentication. Chunk server authentication can
+# still be enabled.
+# Like the other parameters, removing or commenting out these parameters will
+# not turn QFS client authentication back on until the chunk and meta servers
+# restart. To turn QFS client authentication back on, the parameters should be
+# explicitly set back to the original / default values.
+#
+# Overriding the default behavior might be useful for initial authentication
+# setup and/or debugging. Only two sets of values work -- the default and the
+# set of inverted values, i.e. 1 0 0 0 and 0 1 1 1; other combinations will not
+# work, though they could be useful for testing.
+#
+# Default is 1 if QFS client authentication is *not* configured, 0 otherwise.
+# metaServer.clientAuthentication.authNone = 1
+#
+# Default is 0 if QFS client authentication is *not* configured, 1 otherwise.
+# metaServer.clientCSAuthRequired = 0
+#
+# Default is 0 if chunk and meta server authentication is *not*
+# configured, 1 otherwise.
+# chunkServer.client.auth.enabled = 0
+#
+# Default is 0 if chunk and meta server authentication is *not*
+# configured, 1 otherwise.
+# chunkServer.remoteSync.auth.enabled = 0
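+
+# For illustration only (not part of the stock template): the "1 0 0 0"
+# combination mentioned above -- client authentication disabled while chunk
+# server authentication may remain configured -- corresponds to:
+#   metaServer.clientAuthentication.authNone = 1
+#   metaServer.clientCSAuthRequired = 0
+#   chunkServer.client.auth.enabled = 0
+#   chunkServer.remoteSync.auth.enabled = 0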
+
+# -------------------- File system ID ------------------------------------------
+#
+# The file system id is a 64 bit file system identifier generated by the meta
+# server at the time the file system is created. The file system id is used to
+# protect against accidental use of chunk files that belong to a different file
+# system. For example, the file system id should prevent use of a stale chunk
+# inventory that belongs to an "old" / different file system, in the case where
+# a new file system was created and the same "cluster key" is used.
+
+# Require the file system id in the chunk server hello.
+# Default is 0, to maintain backward compatibility with file systems created
+# with no file system id.
+# metaServer.fileSystemIdRequired = 0
+
+# The following parameter might be used for temporary file systems that might
+# be intentionally re-created from scratch on meta server restart, or in the
+# case of loss of the transaction log and/or checkpoint. Use with extra caution.
+# Default is 0: do not delete chunks on file system id mismatch, and do not use
+# chunk directories if the chunk directory file system id does not match.
+# metaServer.deleteChunkOnFsIdMismatch = 0
+
+#-------------------------------------------------------------------------------
+
+# ===================== Chunk servers configuration parameters. ================
+# Configuration parameters in the meta server configuration file take precedence
+# over the chunk server configuration files.
+
+# ---------------------------------- Message log. ------------------------------
+
+# Chunk server log level.
+chunkServer.msgLogWriter.logLevel = NOTICE
+
+# Colon (:) separated file name prefixes to store log segments.
+# Default is empty list. The default is to use the file name from the command
+# line, or, if none is specified, to write into file descriptor 2 -- stderr.
+# chunkServer.msgLogWriter.logFilePrefixes =
+
+# Maximum log segment size.
+# Default is -1 -- unlimited.
+# chunkServer.msgLogWriter.maxLogFileSize = -1
+
+# Maximum number of log segments.
+# Default is -1 -- unlimited.
+# chunkServer.msgLogWriter.maxLogFiles = -1
+
+# Max. time to wait for the log buffer to become available.
+# When wait is enabled the request processing thread will wait for the log
+# buffer disk io to complete. If the disk subsystem cannot keep up with the
+# logging it will slow down the request processing.
+# For chunk servers keeping the default is strongly recommended, to minimize
+# dependency on the host's disk subsystem reliability and performance.
+# Default is -1: do not wait, drop the log record instead.
+# chunkServer.msgLogWriter.waitMicroSec = -1
+
+#-------------------------------------------------------------------------------
+
+# Disk io request timeout.
+# Default is 270 sec. Production value is 40 sec.
+# chunkServer.diskIo.maxIoTimeSec = 270
+
+# Synchronous replication timeouts.
+# Record append synchronous replication timeout.
+# Default is 180 sec. Production value is 20 sec.
+# chunkServer.recAppender.replicationTimeoutSec = 180
+# Write replication timeout.
+# Default is 300 sec. Production value is 20 sec.
+# chunkServer.remoteSync.responseTimeoutSec = 300
+
+# Controls buffered io -- use the os file system cache instead of direct io, on
+# the os / file systems that support direct io (most file systems on linux).
+# It is conceivable that enabling buffered io might help with short reads for
+# "broadcast" / "web server" type loads. For "typical" sequential loads with
+# large (1MB) io requests, enabling caching will likely lower cluster
+# performance due to higher system (os) cpu overhead and memory contention.
+# Default is off.
+# chunkServer.bufferedIo = 0
+
+# If sparse files, and in particular sparse chunks, aren't used (for example,
+# with sequential write only), the following parameter can be set to 0.
+# Default is 1 -- enabled.
+# chunkServer.allowSparseChunks = 1
+
+# The minimum amount of space in bytes that must be available in order for the
+# chunk directory to be used for chunk placement (considered as "writable").
+# Default is the chunk size -- 64MB plus the chunk header size of 16KB.
+# chunkServer.minFsAvailableSpace = 67125248
+
+# The minimum amount of space that must be available in order for the chunk
+# directory to be used for chunk placement (considered as "writable"),
+# expressed as a fraction of the total host file system space.
+# Default is 0.05 or 5%; in other words, stop using the chunk directory when
+# the host file system where the chunk directory resides reaches 95% space
+# utilization.
+# chunkServer.maxSpaceUtilizationThreshold = 0.05
+
+# The "weight" of pending disk io in chunk placement.
+# If set to 0 or less the pending io (number of io bytes in the disk queue) has
+# no effect on the placement (choosing the chunk directory where to create the
+# chunk). If the weight is set greater than 0, then the average pending io per
+# chunk directory (host file system / disk) is calculated as
+# (total_pending_read_bytes * total_pending_read_weight +
+#  total_pending_write_bytes * total_pending_write_weight) / chunk_directory_count
+# Chunk directories whose pending_read + pending_write exceeds the above value
+# are taken out of consideration for placement.
+# For example, with a write weight of 1.3, 10 chunk directories, and 1GB of
+# total pending writes, the average is 1.3 * 1GB / 10, i.e. roughly 133MB; a
+# chunk directory whose own pending io exceeds that is skipped for placement.
+# Default is 0. Typical production value is 1.3
+# chunkServer.chunkPlacementPendingReadWeight = 0
+# chunkServer.chunkPlacementPendingWriteWeight = 0
+
+# Averaging interval for calculating the average time the incoming "client's"
+# requests spend in the io buffer wait queue. The "average wait time" value is
+# used by the meta server for chunk placement. The average exponentially decays
+# (IIR filter).
+# Default is 20 sec. Typical production value is 8.
+# chunkServer.bufferManager.waitingAvgInterval = 20
+
+# "Not available" directories rescan interval in seconds. Default is 180 sec.
+# (see the comment in the chunk server configuration file).
+# chunkServer.dirRecheckInterval = 60
+
+# The following parameter has effect only if client threads are enabled, i.e.
+# if the chunkServer.clientThreadCount parameter is set to a value greater than
+# 0 in the chunk server configuration.
+# If the value is less than chunkServer.clientThreadCount, then threads in the
+# range
+# [chunkServer.client.firstClientThreadIndex, chunkServer.clientThreadCount)
+# will be used to service all requests except RS chunk recovery, otherwise the
+# "main" thread will be used.
+# chunkServer.client.firstClientThreadIndex = 0
+
+# The following parameter has effect only if client threads are enabled, i.e.
+# if the chunkServer.clientThreadCount parameter is set to a value greater than
+# 0 in the chunk server configuration.
+# Limits the number of client threads used for RS recovery. Each thread uses a
+# single dedicated connection to the meta server to perform RS recovery.
+# If set to 0 or less, then the "main" thread is used to perform RS recovery.
+# The client threads in the range
+# [0, min(chunkServer.rsReader.maxRecoveryThreads, chunkServer.clientThreadCount))
+# are used to perform RS recovery.
+# Default is 5 -- the same value as the
+# metaServer.maxConcurrentWriteReplicationsPerNode default.
+# chunkServer.rsReader.maxRecoveryThreads = 5
diff --git a/bigtop-deploy/puppet/modules/qfs/templates/QfsClient.prp b/bigtop-deploy/puppet/modules/qfs/templates/QfsClient.prp
new file mode 100644
index 00000000..94ae962b
--- /dev/null
+++ b/bigtop-deploy/puppet/modules/qfs/templates/QfsClient.prp
@@ -0,0 +1,137 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# The meta server location.
+
+# Where is the metaserver
+metaServer.name = <%= scope['qfs::common::metaserver_host'] %>
+metaServer.port = <%= scope['qfs::common::metaserver_port'] %>
+
+# -------------------- Client and meta server authentication. ------------------
+# By default QFS client and meta server authentication (and, as a consequence,
+# client and chunk server authentication) is off.
+#
+# If any of the following authentication methods is configured then the QFS
+# client and the meta server perform mutual authentication.
+#
+# The QFS client configuration parameters can also be specified via environment
+# variables: QFS_CLIENT_CONFIG and QFS_CLIENT_CONFIG_meta_server_ip_port. The
+# latter variable takes precedence. The dots in the meta server ip (or host
+# name) are replaced with _ (underscore) symbols. The underscore symbol is also
+# used to separate the meta server ip and port.
+# The latter, longer form allows using configuration specific to a particular
+# meta server, and is mainly intended to be used with QFS delegation, where
+# both the delegation token and the key can be passed via environment variables
+# (see the PSK authentication section below).
+#
+# Two environment variable value forms are supported:
+# 1. FILE:configuration_file_name
+# 2. parameter_name1=parameter_value1 parameter_name2=parameter_value2...
+# The second form, space separated key value pairs, can be used to pass the
+# delegation token and the corresponding key. Both of these must be obtained
+# from the meta server via a "delegate" request. See the qfs tool help.
+# For example:
+# QFS_CLIENT_CONFIG_127_0_0_1_20000='client.auth.psk.keyId=AAAB9dYIWfKBXhXCI1jJ9gAAU0XunwAAAACMoK0z30ztT5S7k9slRuRdzy9CXmi1 client.auth.psk.key=P+4XRIBLLBvkICXWO+1aXBPUTMghEakkTk1T+RVsifR9NQ71E32KVd27y+2DbyC2'
+# export QFS_CLIENT_CONFIG_127_0_0_1_20000
+
+
+# ================= X509 authentication ========================================
+#
+# QFS client's X509 certificate file in PEM format.
+# client.auth.X509.X509PemFile =
+
+# Password if X509 PEM file is encrypted.
+# client.auth.X509.X509Password =
+
+# QFS client's private key file.
+# client.auth.X509.PKeyPemFile =
+
+# Password if private key PEM file is encrypted.
+# client.auth.X509.PKeyPassword =
+
+# Certificate authorities file. Used for both meta server certificate
+# validation and to create the certificate chain with the QFS client's X509
+# certificate.
+# client.auth.X509.CAFile =
+
+# Certificate authorities directory; can be used in addition to CAFile.
+# For more detailed information please see the SSL_CTX_load_verify_locations
+# manual page. CAFile/CADir corresponds to CAfile/CApath in the man page.
+# client.auth.X509.CADir =
+
+# If set (the default) verify the peer certificate, and declare an error if the
+# peer, i.e. the meta server, does not present a "trusted" valid X509
+# certificate.
+# Default is on.
+# client.auth.X509.verifyPeer = 1
+
+# OpenSSL cipher configuration.
+# client.auth.X509.cipher = !ADH:!AECDH:!MD5:HIGH:@STRENGTH
+
+# The long integer value passed to the SSL_CTX_set_options() call.
+# See the openssl documentation for details.
+# Default is the integer value that corresponds to SSL_OP_NO_COMPRESSION
+# client.auth.X509.options =
+
+# ================= Kerberos authentication ====================================
+#
+# Kerberos service principal: service/host@realm
+
+# Meta server's Kerberos principal [service/host@realm] service name part.
+# client.auth.krb5.service =
+
+# Meta server's Kerberos principal [service/host@realm] host name part.
+# client.auth.krb5.host =
+
+# Normally kinit is sufficient for user authentication.
+# The following Kerberos parameters might be used in the case when another
+# "service" acts as the QFS client.
+
+# Kerberos keytab file with the key(s) that correspond to the QFS client's
+# principal, if used. A key table is typically used for a service.
+# client.auth.krb5.keytab =
+
+# QFS client's kerberos principal. krb5_parse_name() is used to convert the
+# name into the Kerberos 5 internal principal representation.
+# client.auth.krb5.clientName =
+
+# Force Kerberos client cache initialization during initialization.
+# Default is off.
+# client.auth.krb5.initClientCache = 0
+
+# OpenSSL cipher configuration for the TLS-PSK authentication method. This
+# method is used with delegation and with Kerberos authentication.
+# client.auth.psk.cipherpsk = !ADH:!AECDH:!MD5:!3DES:PSK:@STRENGTH
+
+# The long integer value passed to the SSL_CTX_set_options() call.
+# See the openssl documentation for details.
+# Default is the integer value that corresponds to the logical OR of
+# SSL_OP_NO_COMPRESSION and SSL_OP_NO_TICKET
+# client.auth.psk.options =
+
+# ================= PSK / delegation authentication ============================
+#
+# Both the delegation token and the delegation key are expected to be valid
+# base 64 encoded binary blobs -- the exact string representation returned by
+# the delegation request.
+
+# QFS client delegation token. The token must be obtained via a delegation
+# request to the meta server. Both the token and the corresponding key must be
+# specified.
+# client.auth.psk.keyId =
+
+# QFS client delegation key. The key must be obtained via a delegation request
+# to the meta server.
+# client.auth.psk.key =
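+
+# For illustration only (not part of the stock template): a token and key
+# obtained via a "delegate" request (see the qfs tool help) might be handed to
+# a client process through the environment, e.g.:
+#   export QFS_CLIENT_CONFIG='client.auth.psk.keyId=<token> client.auth.psk.key=<key>'
+# where <token> and <key> stand for the base 64 blobs returned by the request.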
+
+#-------------------------------------------------------------------------------
diff --git a/bigtop-deploy/puppet/modules/qfs/templates/hadoop-qfs b/bigtop-deploy/puppet/modules/qfs/templates/hadoop-qfs
new file mode 100644
index 00000000..7041daf2
--- /dev/null
+++ b/bigtop-deploy/puppet/modules/qfs/templates/hadoop-qfs
@@ -0,0 +1,79 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# A simple wrapper script around hadoop which enables the use of qfs easily.
+# Using the hadoop command with qfs requires users to specify overrides for
+# various parameters (e.g. fs.defaultFS), which can become cumbersome to do on
+# the command line every time. The goal here is to provide a clean user
+# experience in using hadoop with qfs.
+
+export JAVA_LIBRARY_PATH=/usr/lib/qfs
+
+QFS_OPTS=" -Dfs.qfs.impl=com.quantcast.qfs.hadoop.QuantcastFileSystem"
+QFS_OPTS="$QFS_OPTS -Dfs.defaultFS=qfs://<%= scope['qfs::common::metaserver_host'] %>:<%= scope['qfs::common::metaserver_client_port'] %>"
+QFS_OPTS="$QFS_OPTS -Dfs.qfs.metaServerHost=<%= scope['qfs::common::metaserver_host'] %>"
+QFS_OPTS="$QFS_OPTS -Dfs.qfs.metaServerPort=<%= scope['qfs::common::metaserver_port'] %>"
+
+usage() {
+    echo "
+usage: $0 <options>
+    $0 is a simple wrapper around the standard hadoop command that allows for an
+    easy interface with qfs. A subset of the standard hadoop commands are
+    supported (where it makes sense). For all other commands, use the hadoop
+    command directly.
+
+    Commands:
+      fs                        query the qfs filesystem
+      jar <jar> <main class>    submit a job to hadoop that uses qfs
+                                Note: the main class is required
+      distcp <srcurl> <desturl> copy file or directories recursively
+                                Note: supports copy into qfs from hdfs
+    "
+    exit 1
+}
+
+if [ $# -eq 0 ]; then
+    usage
+fi
+
+COMMAND=$1
+shift
+
+case $COMMAND in
+    # usage flags
+    --help|-h)
+        usage
+        ;;
+    fs|distcp)
+        hadoop $COMMAND $QFS_OPTS "$@"
+        ;;
+    jar)
+        if [ $# -lt 2 ]; then
+            echo "$COMMAND: jar file and main class are required"
+            usage
+        fi
+
+        JAR_FILE=$1
+        MAIN_CLASS=$2
+        shift 2
+        hadoop $COMMAND "$JAR_FILE" "$MAIN_CLASS" $QFS_OPTS "$@"
+        ;;
+    *)
+        echo "$COMMAND: unsupported -- try using the hadoop command directly"
+        usage
+        ;;
+esac
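+
+# Example usage (illustration only; the jar and class names below are
+# placeholders):
+#   hadoop-qfs fs -ls /
+#   hadoop-qfs jar wordcount.jar org.example.WordCount input output
diff --git a/bigtop-deploy/puppet/modules/tachyon/manifests/init.pp b/bigtop-deploy/puppet/modules/tachyon/manifests/init.pp
deleted file mode 100644
index ef7e5dfa..00000000
--- a/bigtop-deploy/puppet/modules/tachyon/manifests/init.pp
+++ /dev/null
@@ -1,79 +0,0 @@
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at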
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-class tachyon {
-
-  class deploy ($roles) {
-    if ("tachyon-master" in $roles) {
-      include tachyon::master
-    }
-
-    if ("tachyon-worker" in $roles) {
-      include tachyon::worker
-    }
-  }
-
-  class common ($master_host){
-    package { "tachyon-tfs":
-      ensure => latest,
-    }
-
-    # add logging into /var/log/..
-    file {
-      "/etc/tachyon/conf/log4j.properties":
-        content => template("tachyon/log4j.properties"),
-        require => [Package["tachyon-tfs"]]
-    }
-
-    # add tachyon-env.sh to point to tachyon master
-    file { "/etc/tachyon/conf/tachyon-env.sh":
-      content => template("tachyon/tachyon-env.sh"),
-      require => [Package["tachyon-tfs"]]
-    }
-  }
-
-  class master {
-    include common
-
-    exec {
-      "tachyon formatting":
-        command => "/usr/lib/tachyon/bin/tachyon format",
-        require => [ Package["tachyon-tfs"], File["/etc/tachyon/conf/log4j.properties"], File["/etc/tachyon/conf/tachyon-env.sh"] ]
-    }
-
-    if ( $fqdn == $tachyon::common::master_host ) {
-      service { "tachyon-master":
-        ensure => running,
-        require => [ Package["tachyon-tfs"], Exec["tachyon formatting"] ],
-        hasrestart => true,
-        hasstatus => true,
-      }
-    }
-
-  }
-
-  class worker {
-    include common
-
-    if ( $fqdn == $tachyon::common::master_host ) {
-      notice("tachyon ---> master host")
-      # We want master to run first in all cases
-      Service["tachyon-master"] ~> Service["tachyon-worker"]
-    }
-
-    service { "tachyon-worker":
-      ensure => running,
-      require => [ Package["tachyon-tfs"], File["/etc/tachyon/conf/log4j.properties"], File["/etc/tachyon/conf/tachyon-env.sh"] ],
-      hasrestart => true,
-      hasstatus => true,
-    }
-  }
-}
diff --git a/bigtop-deploy/puppet/modules/tachyon/templates/tachyon-env.sh b/bigtop-deploy/puppet/modules/tachyon/templates/tachyon-env.sh
deleted file mode 100755
index e3a5fb1b..00000000
--- a/bigtop-deploy/puppet/modules/tachyon/templates/tachyon-env.sh
+++ /dev/null
@@ -1,78 +0,0 @@
-#!/usr/bin/env bash
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-# This file contains environment variables required to run Tachyon. Copy it as tachyon-env.sh and
-# edit that to configure Tachyon for your site. At a minimum,
-# the following variables should be set:
-#
-# - JAVA_HOME, to point to your JAVA installation
-# - TACHYON_MASTER_ADDRESS, to bind the master to a different IP address or hostname
-# - TACHYON_UNDERFS_ADDRESS, to set the under filesystem address.
-# - TACHYON_WORKER_MEMORY_SIZE, to set how much memory to use (e.g. 1000mb, 2gb) per worker
-# - TACHYON_RAM_FOLDER, to set where worker stores in memory data
-# - TACHYON_UNDERFS_HDFS_IMPL, to set which HDFS implementation to use (e.g. com.mapr.fs.MapRFileSystem,
-#   org.apache.hadoop.hdfs.DistributedFileSystem)
-
-# The following gives an example:
-
-if [[ `uname -a` == Darwin* ]]; then
-  # Assuming Mac OS X
-  export JAVA_HOME=${JAVA_HOME:-$(/usr/libexec/java_home)}
-  export TACHYON_RAM_FOLDER=/Volumes/ramdisk
-  export TACHYON_JAVA_OPTS="-Djava.security.krb5.realm= -Djava.security.krb5.kdc="
-else
-  # Assuming Linux
-  if [ -z "$JAVA_HOME" ]; then
-    export JAVA_HOME=/usr/lib/jvm/java-7-oracle
-  fi
-  export TACHYON_RAM_FOLDER=/mnt/ramdisk
-fi
-
-export JAVA="$JAVA_HOME/bin/java"
-
-echo "Starting tachyon w/ java = $JAVA "
-
-export TACHYON_MASTER_ADDRESS=<%= @master_host %>
-export TACHYON_UNDERFS_ADDRESS=$TACHYON_HOME/underfs
-#export TACHYON_UNDERFS_ADDRESS=hdfs://localhost:9000
-export TACHYON_WORKER_MEMORY_SIZE=1GB
-export TACHYON_UNDERFS_HDFS_IMPL=org.apache.hadoop.hdfs.DistributedFileSystem
-
-echo "TACHYON master => $TACHYON_MASTER_ADDRESS "
-
-CONF_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
-
-export TACHYON_JAVA_OPTS+="
-  -Dlog4j.configuration=file:$CONF_DIR/log4j.properties
-  -Dtachyon.debug=false
-  -Dtachyon.underfs.address=$TACHYON_UNDERFS_ADDRESS
-  -Dtachyon.underfs.hdfs.impl=$TACHYON_UNDERFS_HDFS_IMPL
-  -Dtachyon.data.folder=$TACHYON_UNDERFS_ADDRESS/tmp/tachyon/data
-  -Dtachyon.workers.folder=$TACHYON_UNDERFS_ADDRESS/tmp/tachyon/workers
-  -Dtachyon.worker.memory.size=$TACHYON_WORKER_MEMORY_SIZE
-  -Dtachyon.worker.data.folder=$TACHYON_RAM_FOLDER/tachyonworker/
-  -Dtachyon.master.worker.timeout.ms=60000
-  -Dtachyon.master.hostname=$TACHYON_MASTER_ADDRESS
-  -Dtachyon.master.journal.folder=$TACHYON_HOME/journal/
-  -Dorg.apache.jasper.compiler.disablejsr199=true
-  -Djava.net.preferIPv4Stack=true
-"
-
-# Master specific parameters. Default to TACHYON_JAVA_OPTS.
-export TACHYON_MASTER_JAVA_OPTS="$TACHYON_JAVA_OPTS"
-
-# Worker specific parameters that will be shared to all workers. Default to TACHYON_JAVA_OPTS.
-export TACHYON_WORKER_JAVA_OPTS="$TACHYON_JAVA_OPTS"
diff --git a/bigtop-deploy/puppet/modules/zeppelin/templates/zeppelin-env.sh b/bigtop-deploy/puppet/modules/zeppelin/templates/zeppelin-env.sh
index 8095f010..ebcfc06d 100644
--- a/bigtop-deploy/puppet/modules/zeppelin/templates/zeppelin-env.sh
+++ b/bigtop-deploy/puppet/modules/zeppelin/templates/zeppelin-env.sh
@@ -18,6 +18,7 @@ export ZEPPELIN_WEBSOCKET_PORT=<%= @web_socket_port %>
 export ZEPPELIN_CONF_DIR=/etc/zeppelin/conf
 export ZEPPELIN_LOG_DIR=/var/log/zeppelin
 export ZEPPELIN_PID_DIR=/var/run/zeppelin
+export ZEPPELIN_WAR_TEMPDIR=/var/run/zeppelin/webapps
 export ZEPPELIN_NOTEBOOK_DIR=/var/lib/zeppelin/notebook
 export MASTER=<%= @spark_master_url %>
 export SPARK_HOME=/usr/lib/spark
diff --git a/bigtop-deploy/vm/utils/setup-env-debian.sh b/bigtop-deploy/vm/utils/setup-env-debian.sh
index 3d34fe12..ac92615d 100755
--- a/bigtop-deploy/vm/utils/setup-env-debian.sh
+++ b/bigtop-deploy/vm/utils/setup-env-debian.sh
@@ -29,6 +29,7 @@ if [ $enable_local_repo == "true" ]; then
   echo "deb file:///bigtop-home/output/apt bigtop contrib" > /etc/apt/sources.list.d/bigtop-home_output.list
   apt-get update
 else
+  apt-get install -y apt-transport-https
   echo "local apt = $enable_local_repo ; NOT Enabling local apt. Packages will be pulled from remote..."
 fi
diff --git a/bigtop-deploy/vm/utils/smoke-tests.sh b/bigtop-deploy/vm/utils/smoke-tests.sh
index 8dac31c0..93f795e6 100755
--- a/bigtop-deploy/vm/utils/smoke-tests.sh
+++ b/bigtop-deploy/vm/utils/smoke-tests.sh
@@ -36,9 +36,17 @@ export SQOOP_HOME=/usr/lib/sqoop/
 export HIVE_CONF_DIR=/etc/hive/conf/
 export MAHOUT_HOME="/usr/lib/mahout"

-su -s /bin/bash $HCFS_USER -c '/usr/bin/hadoop fs -mkdir /user/vagrant /user/root'
-su -s /bin/bash $HCFS_USER -c 'hadoop fs -chmod 777 /user/vagrant'
-su -s /bin/bash $HCFS_USER -c 'hadoop fs -chmod 777 /user/root'
+prep() {
+  HADOOP_COMMAND=$1
+  su -s /bin/bash $HCFS_USER -c "JAVA_LIBRARY_PATH=/usr/lib/qfs $HADOOP_COMMAND fs -mkdir /user/vagrant /user/root"
+  su -s /bin/bash $HCFS_USER -c "JAVA_LIBRARY_PATH=/usr/lib/qfs $HADOOP_COMMAND fs -chmod 777 /user/vagrant"
+  su -s /bin/bash $HCFS_USER -c "JAVA_LIBRARY_PATH=/usr/lib/qfs $HADOOP_COMMAND fs -chmod 777 /user/root"
+}
+
+prep hadoop
+if [[ $SMOKE_TESTS == *"qfs"* ]]; then
+  prep hadoop-qfs
+fi

 if [ -f /etc/debian_version ] ; then
   apt-get -y install pig hive flume mahout sqoop
diff --git a/bigtop-deploy/vm/vagrant-puppet-docker/README.md b/bigtop-deploy/vm/vagrant-puppet-docker/README.md
index 79e50f78..602b10b0 100644
--- a/bigtop-deploy/vm/vagrant-puppet-docker/README.md
+++ b/bigtop-deploy/vm/vagrant-puppet-docker/README.md
@@ -157,4 +157,4 @@ See `bigtop-deploy/puppet/config/site.csv.example` for more details.

 ##Notes

-* Users currently using vagrant 1.6+ is strongly recommanded to upgrade to 1.6.4+, otherwise you will encounter the [issue](https://github.com/mitchellh/vagrant/issues/3769) when installing plguins
+* Users currently using vagrant 1.6+ are strongly recommended to upgrade to 1.6.4+, otherwise you will encounter the [issue](https://github.com/mitchellh/vagrant/issues/3769) when installing plugins
diff --git a/bigtop-deploy/vm/vagrant-puppet-docker/vagrantconfig_centos-7.yaml b/bigtop-deploy/vm/vagrant-puppet-docker/vagrantconfig_centos-7.yaml
index 49de5737..92d7468f 100644
--- a/bigtop-deploy/vm/vagrant-puppet-docker/vagrantconfig_centos-7.yaml
+++ b/bigtop-deploy/vm/vagrant-puppet-docker/vagrantconfig_centos-7.yaml
@@ -21,7 +21,7 @@ boot2docker:
   memory_size: "4096"
   number_cpus: "1"

-repo: "http://bigtop-repos.s3.amazonaws.com/releases/1.1.0/centos/6/x86_64"
+repo: "http://bigtop-repos.s3.amazonaws.com/releases/1.1.0/centos/7/x86_64"
 distro: centos
 components: [hadoop, yarn]
 namenode_ui_port: "50070"
diff --git a/bigtop-deploy/vm/vagrant-puppet-docker/vagrantconfig_ubuntu_ppc64le.yaml b/bigtop-deploy/vm/vagrant-puppet-docker/vagrantconfig_ubuntu_ppc64le.yaml
new file mode 100644
index 00000000..d34ba914
--- /dev/null
+++ b/bigtop-deploy/vm/vagrant-puppet-docker/vagrantconfig_ubuntu_ppc64le.yaml
@@ -0,0 +1,32 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+docker:
+  memory_size: "4096"
+  image: "bigtop/deploy:ubuntu-15.04-ppc64le"
+
+boot2docker:
+  memory_size: "4096"
+  number_cpus: "1"
+
+repo: "http://bigtop-repos.s3.amazonaws.com/releases/1.1.0/ubuntu/vivid/ppc64el"
+distro: debian
+components: [hadoop, yarn]
+namenode_ui_port: "50070"
+yarn_ui_port: "8088"
+hbase_ui_port: "60010"
+enable_local_repo: false
+smoke_test_components: [mapreduce, pig]
+jdk: "openjdk-7-jdk"
diff --git a/bigtop-deploy/vm/vagrant-puppet-openstack/vagrantconfig.yaml b/bigtop-deploy/vm/vagrant-puppet-openstack/vagrantconfig.yaml
index 76633557..86077ddd 100644
--- a/bigtop-deploy/vm/vagrant-puppet-openstack/vagrantconfig.yaml
+++ b/bigtop-deploy/vm/vagrant-puppet-openstack/vagrantconfig.yaml
@@ -12,7 +12,7 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-repo: "http://bigtop01.cloudera.org:8080/view/Releases/job/Bigtop-0.8.0/label=centos6/6/artifact/output/"
+repo: "http://bigtop-repos.s3.amazonaws.com/releases/1.1.0/centos/6/x86_64"
 num_instances: 1
 distro: centos
 components: [hadoop, yarn]
diff --git a/bigtop-deploy/vm/vagrant-puppet-vm/README.md b/bigtop-deploy/vm/vagrant-puppet-vm/README.md
index 2b21a706..ed06eca0 100644
--- a/bigtop-deploy/vm/vagrant-puppet-vm/README.md
+++ b/bigtop-deploy/vm/vagrant-puppet-vm/README.md
@@ -59,7 +59,7 @@ num_instances: 5

 first, build up local yum repo
 ```
-cd bigtop; ./gradlew tachyon-yum
+cd bigtop; ./gradlew alluxio-yum
 ```
 and then enable local yum in vagrantconfig.yaml
diff --git a/bigtop-deploy/vm/vagrant-puppet-vm/Vagrantfile b/bigtop-deploy/vm/vagrant-puppet-vm/Vagrantfile
index 63aff201..b4017d0e 100755
--- a/bigtop-deploy/vm/vagrant-puppet-vm/Vagrantfile
+++ b/bigtop-deploy/vm/vagrant-puppet-vm/Vagrantfile
@@ -67,6 +67,7 @@ bigtop_master = "bigtop1.vagrant"
 $script = <<SCRIPT
 service iptables stop
+service firewalld stop
 chkconfig iptables off
 # Remove 127.0.0.1 entry since vagrant's hostname setting will map it to FQDN,
 # which miss leads some daemons to bind on 127.0.0.1 instead of public or private IP