Kiba: a lightweight ETL tool for Ruby

Posted by xf3f 9 years ago | 25K reads | Tags: Kiba, Ruby development

Writing reliable, concise, well-tested, and maintainable data processing code is tricky. Kiba lets you define and run high-quality ETL (Extract-Transform-Load) jobs using Ruby.

Kiba provides you with a DSL to define ETL jobs:

# standard-library dependencies used below (Date.strptime, YAML.load)
require 'date'
require 'yaml'

# declare a ruby method here, for quick reusable logic
def parse_french_date(date)
  Date.strptime(date, '%d/%m/%Y')
end

# or better, include a ruby file which loads reusable assets
# eg: commonly used sources / destinations / transforms, under unit-test
require_relative 'common'

# declare a pre-processor: a block called before the first row is read
pre_process do
  # do something
end

# declare a source where to take data from (you implement it - see notes below)
source MyCsvSource, 'input.csv'

# declare a row transform to process a given field
transform do |row|
  row[:birth_date] = parse_french_date(row[:birth_date])
  # return to keep in the pipeline
  row
end

# declare another row transform, dismissing rows conditionally by returning nil
transform do |row|
  row[:birth_date].year < 2000 ? row : nil
end

# declare a row transform as a class, which can be tested properly
transform ComplianceCheckTransform, eula: 2015

# before declaring a definition, maybe you'll want to retrieve credentials
config = YAML.load(IO.read('config.yml'))

# declare a destination - like source, you implement it (see below)
destination MyDatabaseDestination, config['my_database']

# declare a post-processor: a block called after all rows are successfully processed
post_process do
  # do something
end
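The job definition above refers to `MyCsvSource`, `ComplianceCheckTransform` and `MyDatabaseDestination`, which you implement yourself; the notes it points to are not reproduced on this page. What follows is a minimal sketch, assuming Kiba's usual conventions: a source responds to `each` and yields one row at a time, a class-based transform responds to `process(row)`, and a destination responds to `write(row)` and `close`. The `eula_version` column, the in-memory `rows` buffer and the `run_pipeline` helper are hypothetical, introduced only for illustration; they are not part of Kiba.

```ruby
require 'csv'

# Sketch of a source: receives the arguments given after the class name
# in the `source` declaration and yields each row as a Hash.
class MyCsvSource
  def initialize(input_file)
    @input_file = input_file
  end

  def each
    # headers become symbol keys, e.g. "birth_date" => :birth_date
    CSV.foreach(@input_file, headers: true, header_converters: :symbol) do |row|
      yield(row.to_h)
    end
  end
end

# Sketch of a class-based transform: return the row to keep it,
# or nil to dismiss it. The :eula_version column is an assumption.
class ComplianceCheckTransform
  def initialize(options = {})
    @eula = options.fetch(:eula)
  end

  def process(row)
    row[:eula_version].to_i >= @eula ? row : nil
  end
end

# Stand-in destination: a real one would open a database connection
# from `config`; this one only buffers rows in memory (assumption).
class MyDatabaseDestination
  attr_reader :rows

  def initialize(config)
    @config = config
    @rows = []
    @closed = false
  end

  def write(row)
    @rows << row
  end

  def close
    @closed = true
  end

  def closed?
    @closed
  end
end

# Hypothetical helper (not Kiba's runner): pushes each row through the
# transforms in order and stops as soon as one of them returns nil.
def run_pipeline(source, transforms, destination)
  source.each do |row|
    row = transforms.reduce(row) { |r, t| r && t.process(r) }
    destination.write(row) if row
  end
  destination.close
end
```

Because each piece is a plain Ruby class, it can be unit-tested on its own, without running a whole job, which is the point of declaring transforms as classes rather than blocks.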

Project homepage: http://www.baiduhome.net/lib/view/home/1429931698244
