Spark Accessing Hive to Execute SQL


From: http://my.oschina.net/Rayn/blog/606856


I've recently been using Spark together with Hive to run queries. Running a demo produced the following error:

01-20 14:49:41 [INFO] [storage.BlockManagerMaster(59)] BlockManagerMaster stopped
01-20 14:49:41 [INFO] [spark.SparkContext(59)] Successfully stopped SparkContext
01-20 14:49:41 [INFO] [scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint(59)] OutputCommitCoordinator stopped!

java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: hdfs://testserver:9000/usr/hive/warehouse on HDFS should be writable. Current permissions are: rwxrwxr-x

    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
    at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:171)
    at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:162)
    at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:160)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:167)
    at com.rayn.hive.test.HiveTestCase.setUp(HiveTestCase.java:36)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:117)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:234)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:74)
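For context, a minimal setup like the following is enough to trigger the error. This is only a sketch of the kind of demo involved (the class name, master URL, and table are my own placeholders, not the original test); the point, visible in the stack trace above, is that the failure happens while the HiveContext is being constructed, before any SQL runs:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.hive.HiveContext;

// Minimal sketch (Spark 1.x API). Constructing a HiveContext starts a Hive
// SessionState, which performs the scratch-dir permission check discussed below.
public class HiveScratchDirRepro {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("hive-scratch-dir-repro")
                .setMaster("local[2]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        // Throws the RuntimeException above if the root scratch dir on HDFS
        // does not satisfy Hive's permission mask.
        HiveContext hiveContext = new HiveContext(sc.sc());
        hiveContext.sql("SELECT column1 FROM tablename").show();
        sc.stop();
    }
}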



First, to be clear: this permission problem has nothing to do with anything in the configuration files.

After stepping through the source in a debugger, I finally traced the problem to this method of the SessionState.java class in hive-exec-1.2.1:

private Path createRootHDFSDir(HiveConf conf) throws IOException {
    Path rootHDFSDirPath = new Path(HiveConf.getVar(conf, HiveConf.ConfVars.SCRATCHDIR));
    FsPermission writableHDFSDirPermission = new FsPermission((short)00733);
    FileSystem fs = rootHDFSDirPath.getFileSystem(conf);
    if (!fs.exists(rootHDFSDirPath)) {
      Utilities.createDirsWithPermission(conf, rootHDFSDirPath, writableHDFSDirPermission, true);
    }
    FsPermission currentHDFSDirPermission = fs.getFileStatus(rootHDFSDirPath).getPermission();
    if (rootHDFSDirPath != null && rootHDFSDirPath.toUri() != null) {
      String schema = rootHDFSDirPath.toUri().getScheme();
      LOG.debug(
        "HDFS root scratch dir: " + rootHDFSDirPath + " with schema " + schema + ", permission: " +
          currentHDFSDirPermission);
    } else {
      LOG.debug(
        "HDFS root scratch dir: " + rootHDFSDirPath + ", permission: " + currentHDFSDirPermission);
    }
    // If the root HDFS scratch dir already exists, make sure it is writeable.
    if (!((currentHDFSDirPermission.toShort() & writableHDFSDirPermission
        .toShort()) == writableHDFSDirPermission.toShort())) {
      throw new RuntimeException("The root scratch dir: " + rootHDFSDirPath
          + " on HDFS should be writable. Current permissions are: " + currentHDFSDirPermission);
    }
    return rootHDFSDirPath;
}



Note this line of code:
FsPermission writableHDFSDirPermission = new FsPermission((short)00733);

This is the permission mask Hive uses when verifying that its root scratch dir on HDFS is writable: 00733 is octal for rwx-wx-wx, i.e. write and execute permission for group and others.

The code that throws the exception is:

/* If the root HDFS scratch dir already exists, make sure it is writeable. */
    if (!((currentHDFSDirPermission.toShort() & writableHDFSDirPermission
        .toShort()) == writableHDFSDirPermission.toShort())) {
      throw new RuntimeException("The root scratch dir: " + rootHDFSDirPath
          + " on HDFS should be writable. Current permissions are: " + currentHDFSDirPermission);
    }
As the comment says, this makes sure that if the scratch dir already exists, it is writable (in the sense of the 0733 mask).
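To see concretely why the rwxrwxr-x permissions from the error message fail this check, here is a small standalone example (my own illustration, not part of the Hive source) that plugs in the two values:

import org.apache.hadoop.fs.permission.FsPermission;

// The error reports the scratch dir as rwxrwxr-x (octal 0775), while Hive
// requires the 0733 mask (rwx-wx-wx) to be fully covered.
public class MaskCheckDemo {
    public static void main(String[] args) {
        short required = (short) 00733; // rwx-wx-wx
        short current  = (short) 00775; // rwxrwxr-x, from the error message
        // 0775 & 0733 = 0731: "others" are r-x and lack the write bit,
        // so the mask is not fully covered and SessionState throws.
        boolean writable = (current & required) == required;
        System.out.println(new FsPermission(required)); // rwx-wx-wx
        System.out.println(new FsPermission(current));  // rwxrwxr-x
        System.out.println("passes check: " + writable); // false
    }
}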

Based on this, I wrote a test method to enumerate which permission values pass the check; the code and results are as follows:

/**
 * Permission test: enumerate all 512 possible permission values and log
 * the ones that satisfy Hive's mask check. 475 is the decimal form of
 * octal 0733 (rwx-wx-wx).
 *
 * @throws Exception
 */
@Test
public void testPermission() throws Exception
{
    List<String> s475 = new ArrayList<String>();

    FsPermission writableHDFSDirPermission = new FsPermission((short) 00733);
    for (int key = 0; key < 512; key++)
    {
        writableHDFSDirPermission.fromShort((short) key);
        String permissionStr = writableHDFSDirPermission.toString();

        // Same check as SessionState: the 0733 mask must be fully covered.
        if ((key & 475) == 475)
        {
            log.info("key = {}, permission = {}", key, permissionStr);
            s475.add(String.valueOf(key));
        }

        if (StringUtils.equalsIgnoreCase("rwxrwxrwx", permissionStr))
        {
            break;
        }
    }
}



The test results (the four matching keys 475, 479, 507, and 511 are the decimal forms of octal 0733, 0737, 0773, and 0777, i.e. exactly the permission sets that cover all of rwx-wx-wx):

01-20 14:57:12 [INFO] [test.OtherTestCase(59)] key = 475, permission = rwx-wx-wx
01-20 14:57:12 [INFO] [test.OtherTestCase(59)] key = 479, permission = rwx-wxrwx
01-20 14:57:12 [INFO] [test.OtherTestCase(59)] key = 507, permission = rwxrwx-wx
01-20 14:57:12 [INFO] [test.OtherTestCase(59)] key = 511, permission = rwxrwxrwx


Process finished with exit code 0

I was only executing statements like SELECT column1 FROM tablename, with nothing writing anything at all. But as the stack trace shows, SessionState.start runs when the HiveContext is constructed, before any query executes, so even a read-only query hits this check.


Very frustrating......




The final fix:

Download the Hive source and comment out the line that throws,

then go into the $HIVE_HOME/ql directory (the module containing SessionState.java),

and run the command mvn clean install -Phadoop-2 -DskipTests to rebuild and install the package. After that, everything works normally!
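If you would rather not patch and rebuild Hive, a less invasive alternative (my own suggestion, not part of the original fix, and it assumes you are allowed to change the directory's permissions on HDFS) is to widen the scratch dir so that it satisfies the 0733 mask, for example:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

// Sketch: grant the permissions Hive's check requires, e.g. 0777.
// The shell equivalent would be: hdfs dfs -chmod 777 /usr/hive/warehouse
public class WidenScratchDir {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(
                URI.create("hdfs://testserver:9000"), new Configuration());
        fs.setPermission(new Path("/usr/hive/warehouse"),
                new FsPermission((short) 00777)); // rwxrwxrwx covers 0733
    }
}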






