 

Flume Installation and Usage Guide

Author: Hadoop实战专家 · 2013-03-2



1. Flume Overview

Flume is a log collection system from Cloudera. It supports customizable data senders for collecting log data, can perform simple processing on that data, and writes it out to a variety of (customizable) data receivers.

Flume is a distributed, reliable, and highly available system for collecting, aggregating, and transporting large volumes of log data.

2. Installation and Usage

2.1 Installation

a. Download: http://archive.cloudera.com/cdh/3/flume-0.9.0+1.tar.gz

Then extract the archive. Below, $flume stands for the extraction path.

b. User guide: http://archive.cloudera.com/cdh/3/flume/UserGuide.html

c. Download: http://archive.cloudera.com/cdh/3/zookeeper-3.3.1.tar.gz

d. Install ZooKeeper:

yum install hadoop-zookeeper -y

yum install hadoop-zookeeper-server -y

Rename /zookeeper-3.3.1/conf/zoo_sample.cfg to zoo.cfg.
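A minimal standalone zoo.cfg might look like the following sketch; the dataDir path here is an assumption and should point at a writable directory on your machine:

```
# Minimal standalone ZooKeeper configuration (zoo.cfg)
# tickTime: basic time unit in milliseconds
# dataDir:  where snapshots are stored (placeholder path, adjust for your setup)
# clientPort: the port clients connect to
tickTime=2000
dataDir=/home/hadoop/zookeeper-3.3.1/data
clientPort=2181
```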

Then run the following commands:

export  ZOOKEEPER_HOME=/home/hadoop/zookeeper-3.3.1

export  FLUME_HOME=/home/hadoop/flume-0.9.0+1

export  PATH=.:$FLUME_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH

2.2 Usage

Running flume with no arguments prints the following usage:

usage: flume command [args...]

commands include:

dump            Takes a specified source and dumps to console

node            Start a Flume node/agent (with watchdog)

master          Start a Flume Master server (with watchdog)

version         Dump flume build version information

node_nowatch    Start a flume node/agent (no watchdog)

master_nowatch  Start a Flume Master server (no watchdog)

class    Run specified fully qualified class using Flume environment (no watchdog)

ex: flume com.cloudera.flume.agent.FlumeNode

classpath       Dump the classpath used by the java executables

shell           Start the flume shell

To start the Flume master node, run: bin/flume master

Reading a file with Flume

Enter the command:

$ flume dump 'tail("/home/hadoop/log/bb.txt")'

The tailed lines are printed to the console as Flume events.


Importing a file into HDFS with Flume

Open http://10.1.27.30:35871/flumemaster.jsp to see the status of all nodes.

On that page, open the config tab, enter the node configuration, and click submit.


A source is the data source, and many kinds of input sources are available; a sink is the receiver. When the configured node is started, the file is written into HDFS.

Start the configured node: bin/flume node -n master
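As a sketch of what such a node configuration looks like, Flume OG maps a node name to a source and a sink; the log path and namenode URI below are placeholders for this particular setup:

```
master : tail("/home/hadoop/log/bb.txt") | dfs("hdfs://namenode/flume/bb.seq");
```

Here the node named master tails the text file and writes each event to HDFS in seqfile format, as described in the appendix below.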

Reading syslog-ng with Flume


Start the host node and the collector node separately.
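A two-tier setup of this kind can be sketched as follows, assuming the node names host and collector, the default collector port 35853 from the port table below, and a placeholder HDFS directory; agentSink and collectorSource/collectorSink are Flume OG's agent/collector translations, so verify the exact arguments against the user guide for your build:

```
host : syslogTcp(5140) | agentSink("collector", 35853);
collector : collectorSource(35853) | collectorSink("hdfs://namenode/flume/syslog/", "log");
```

The host node receives syslog-ng traffic on TCP port 5140 and forwards it to the collector, which batches events into HDFS files prefixed with "log".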


3. Appendix

Flume Event Sources

console

Stdin console.

text("filename")

One-shot text file source. One line is one event.

tail("filename")

Similar to Unix’s tail -F. One line is one event. Stays open for more data and follows filename if file rotated.

multitail("file1"[, "file2"[, …]])

Similar to tail source but follows multiple files.

asciisynth(msg_count,msg_size)

A source that synthetically generates msg_count random messages of size msg_size. This converts all characters into printable ASCII characters.

syslogUdp(port)

Syslog over the given UDP port. This is syslog compatible.

syslogTcp(port)

Syslog over the given TCP port. This is syslog-ng compatible.

Flume Event Sinks

null

Null sink. Events are dropped.

console[("format")]

Console sink. Display to console’s stdout. The "format" argument is optional and defaults to the "debug" output format.

text("txtfile"[,"format"])

Textfile sink. Write the events to text file txtfile using output format "format". The default format is "raw" event bodies with no metadata.

dfs("dfsfile")

DFS seqfile sink. Write serialized Flume events to a DFS path such as hdfs://namenode/file or file:///file in Hadoop's seqfile format. Note that because of HDFS write semantics, no data written by this sink becomes visible until the sink is closed.

syslogTcp("host",port)

Syslog TCP sink. Forward events to host on TCP port port in syslog wire format (syslog-ng compatible), or to other Flume nodes set up to listen for syslogTcp.
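Any of the sources above can be tried directly with the flume dump command shown earlier, which attaches a console sink automatically; for example, to print ten synthetic 20-byte events generated by the asciisynth source:

```
$ flume dump 'asciisynth(10,20)'
```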

The default ports are as follows. TCP is used in all cases.

node collector port         flume.collector.port                   35853+

node status web server      flume.node.http.port                   35862+

master status web server    flume.master.http.port                 35871

master heartbeat port       flume.master.heartbeat.port            35872

master admin/shell port     flume.master.admin.port                35873

master gossip port          flume.master.gossip.port               35890

master → zk port            flume.master.zk.client.port            3181

zk → zk quorum port         flume.master.zk.server.quorum.port     3182

zk → zk election port       flume.master.zk.server.election.port   3183
