
Re: The file removal performance is very low by mdtest

Groot
 

Thanks. Where can I find the documentation for --dfs.oclass and --dfs.dir_oclass? Do they accept other values besides S1 and SX?


Re: The file removal performance is very low by mdtest

Chaarawi, Mohamad
 

Hi,

 

What mdtest command did you run?

Typically, low remove performance indicates that you are using the SX object class for files (the default), which is fine if your files are going to be large. However, that’s not the case with mdtest.

So I would suggest re-running the same command, but adding:

--dfs.oclass S1 --dfs.dir_oclass SX

This tells DFS to use:

  • single stripe/shard for files (best for mdtest, where files are small or empty)
  • full striping for directories (also best for mdtest, where directories usually contain a lot of files); see the example invocation below.
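For reference, a full invocation could look something like the following (the pool/container labels, process count, file count, and test directory are placeholders, not taken from this thread):

mpirun -np 20 mdtest -a DFS -n 10000 -d /testdir --dfs.pool mypool --dfs.cont mycont --dfs.oclass S1 --dfs.dir_oclass SX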

 

Thanks,

Mohamad

 

From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Groot <kukougu@...>
Date: Tuesday, April 12, 2022 at 10:15 PM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: [daos] The file removal performance is very low by mdtest

I installed DAOS and used mdtest to run some tests with -a DFS (one server and one client with 20 cores; the server uses 256G DCPM without NVMe SSDs). However, I find that the file removal performance is very low (results below), and I want to know whether this is expected, or how to tune the performance.


The server config file is:

access_points: ['server-1']

port: 10001

transport_config:

    allow_insecure: false

    client_cert_dir: /etc/daos/certs/clients

    ca_cert: /etc/daos/certs/daosCA.crt

    cert: /etc/daos/certs/server.crt

    key: /etc/daos/certs/server.key

provider: ofi+verbs;ofi_rxm

control_log_mask: INFO

control_log_file: /tmp/daos_server.log

helper_log_file: /tmp/daos_admin.log

telemetry_port: 9191

 

engines:

-

    targets: 20

    nr_xs_helpers: 20

    fabric_iface: ib0

    fabric_iface_port: 31316

    log_mask: ERR

    log_file: /tmp/daos_engine_0.log

    scm_mount: /mnt/daos0

    scm_class: dcpm

    scm_list: [/dev/pmem0]

Thanks a lot
Groot


Re: dfuse mount error in centos 7

Hennecke, Michael
 

Yes, that’s correct – libioil requires the container to be dfuse-mounted. Note that all metadata operations will still go through dfuse; libioil only intercepts (most of) the data I/O calls.
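For example, the typical sequence looks something like this (pool/container labels, mount point, and application name are placeholders, not from this thread):

dfuse --pool mypool --container mycont --mountpoint /tmp/dfuse
LD_PRELOAD=/usr/lib64/libioil.so ./my_app /tmp/dfuse/testfile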

 

Best,

Michael

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Groot
Sent: Wednesday, 13 April 2022 04:42
To: daos@daos.groups.io
Subject: Re: [daos] dfuse mount error in centos 7

 

OK, thanks. Do I need to mount dfuse before using the libioil interception library?

Thanks a lot.
Groot



The file removal performance is very low by mdtest

Groot
 

I installed DAOS and used mdtest to run some tests with -a DFS (one server and one client with 20 cores; the server uses 256G DCPM without NVMe SSDs). However, I find that the file removal performance is very low (results below), and I want to know whether this is expected, or how to tune the performance.


The server config file is:

access_points: ['server-1']

port: 10001

transport_config:

    allow_insecure: false

    client_cert_dir: /etc/daos/certs/clients

    ca_cert: /etc/daos/certs/daosCA.crt

    cert: /etc/daos/certs/server.crt

    key: /etc/daos/certs/server.key

provider: ofi+verbs;ofi_rxm

control_log_mask: INFO

control_log_file: /tmp/daos_server.log

helper_log_file: /tmp/daos_admin.log

telemetry_port: 9191

 

engines:

-

    targets: 20

    nr_xs_helpers: 20

    fabric_iface: ib0

    fabric_iface_port: 31316

    log_mask: ERR

    log_file: /tmp/daos_engine_0.log

    scm_mount: /mnt/daos0

    scm_class: dcpm

    scm_list: [/dev/pmem0]

Thanks a lot
Groot


Re: dfuse mount error in centos 7

Groot
 

OK, thanks. Do I need to mount dfuse before using the libioil interception library?

Thanks a lot.
Groot


Re: dfuse mount error in centos 7

Hennecke, Michael
 

Hi,

 

Please make sure that you are on CentOS 7.9; your kernel version indicates that you may be on an older release.

 

Best,

Michael

 

From: daos@daos.groups.io <daos@daos.groups.io> On Behalf Of Groot
Sent: Saturday, 9 April 2022 16:57
To: daos@daos.groups.io
Subject: [daos] dfuse mount error in centos 7

 

I created the pool and a POSIX-type container successfully. But mounting the DFS container with dfuse fails with the following error message:
fuse: error: filesystem requested capabilities 0x10000 that are not supported by kernel, aborting.
My system is CentOS 7 and the kernel is 3.10.0-957. I installed the DAOS services from the yum repo.

Thanks a lot.



dfuse mount error in centos 7

Groot
 

I created the pool and a POSIX-type container successfully. But mounting the DFS container with dfuse fails with the following error message:
fuse: error: filesystem requested capabilities 0x10000 that are not supported by kernel, aborting.
My system is CentOS 7 and the kernel is 3.10.0-957. I installed the DAOS services from the yum repo.

Thanks a lot.


DAOS Community Update / Apr'22

Lombardi, Johann
 

Hi there,

 

Please find below the DAOS community newsletter for April 2022.

 

Past Events

 

Upcoming Events

  • Salishan Conference On High Speed Computing (April 26th)
    Accelerating Data-driven Workflows with DAOS

    Johann Lombardi (Intel)
  • ECP BoF (May 11th)
    DAOS Next Generation Storage
    Kevin Harms (ANL)
    Mohamad Chaarawi (Intel)
    Johann Lombardi (Intel)
  • ISC BoF (May 30th or June 1st)
    Accelerating HPC and AI with DAOS Storage
    Kevin Harms (ANL)
    Michael Hennecke (Intel)
    Johann Lombardi (Intel)

 

Release

  • Current stable release is 2.0.2. See https://docs.daos.io/v2.0/ and https://packages.daos.io/v2.0/ for more information.
  • Branches:
    • release/2.0 is the release branch for the stable 2.0 release. Latest bug fix release is 2.0.2.
    • release/2.2 was created earlier this week for the future 2.2 release. Latest test build is 2.1.101.
    • master is the development branch for the future 2.4 release. Latest test build is 2.3.100.
  • Major recent changes on release/2.0:
    • Several documentation fixes
    • Several test fixes
    • Avoid holding CPU for too long in DTX
    • Fix intermittent failure that can occur when a node loses MS leadership and regains it
  • Major recent changes on release/2.2 and master (same for now):
    • Add UCX support to the CART and the control plane
    • Add UCX support to the RPM
    • Update dmg query command to show the list of excluded engines in a pool.
    • Add support to core_dump_filter to the server yaml file
    • Several aggregation/reclaim improvements
    • Add pool versioning and upgrade capability for interoperability with 2.0.
    • Add interoperability check to the control plane
    • Rewrite netdetect in the control plane
    • Report accurate mtime via libdfs/dfuse by using DAOS internal versioning scheme
    • Several coverity fixes
    • Remove CentOS 8 support and move build/testing to Rocky Linux 8.5
    • Add tensorflow-IO documentation
    • Use strict CPU binding for the engine
    • Improve response time of dmg pool list
  • What is coming:
    • 2.2.0 testing and code freeze
    • 2.4 features to land to master

 

R&D

  • Major features under development:
  • Pathfinding:
    • DAOS Pipeline API for active storage
      • Work in progress to support pipeline API in the engine
      • Use cases under development:
    • Leveraging the Intel Data Streaming Accelerator (DSA) to accelerate DAOS
      • Prototype leveraging DSA for VOS aggregation delivered
    • S3 support via a DAOS backend to Rados Gateway (RGW)
    • Block interface over DAOS using SPDK DAOS bdev
    • GPU data path optimizations

 

News

See https://events.linuxfoundation.org/sodacode/ for more information.

 



Re: dfs_lookup behavior for non-existent files?

Tuffli, Chuck
 

Mohamad

Thank you for the sanity check regarding dfs_lookup. After a little sleuthing, the application (evidently) was modifying the effective UID/GID around the time of that lookup. And it was this *ID change that made networking fail. With those calls changed, DFS is now doing what I expected/thought/hoped 🙂

--chuck


From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Chaarawi, Mohamad <mohamad.chaarawi@...>
Sent: Tuesday, April 5, 2022 5:21 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] dfs_lookup behavior for non-existent files?
 

Hi Chuck,

 

Neither dfs_lookup nor dfs_stat sets st_ino in the stat buffer.

The reason is that files are uniquely identified by the DAOS object ID, which is 128 bits (64 hi, 64 lo).

You can retrieve that using dfs_obj2id():

https://github.com/daos-stack/daos/blob/master/src/include/daos_fs.h#L316

 

Now, for the other error: that seems weird. The errors are coming from the network layer. At that point, are there any servers that are down or were killed (specifically the engine with rank 1)? That would explain the errors.

When I try this myself, I get ENOENT for lookup on “//.Trash” as expected.

 

Thanks,

Mohamad

 

From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Tuffli, Chuck <chuck.tuffli@...>
Date: Tuesday, April 5, 2022 at 12:58 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: [daos] dfs_lookup behavior for non-existent files?

I'm porting an existing application to use DFS (DAOS v2.0.2) instead of POSIX and need help understanding the error messages printed to the console.

 

The code is using dfs_lookup() to retrieve the struct stat of a file. Note the implementation cannot use dfs_stat() as it requires valid values for fields such as st_ino that dfs_stat() does not provide. The code in question is:

 

int
d_lstat(const char * restrict path, struct stat * restrict sb)
{
    int rc;
    dfs_obj_t *obj = NULL;

    rc = dfs_lookup(dfs, path, O_RDONLY, &obj, NULL, sb);
    ...

 

If the file path exists (e.g. "/"), this works. But if the path doesn't exist (e.g. "//.Trash"), the call to dfs_lookup() does not return. Instead, the console endlessly prints messages like:

 

04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] external ERR  # [6937851.329315] mercury->msg: [error] /builddir/build/BUILD/mercury-2.1.0rc4/src/na/na_ofi.c:2972

 # na_ofi_msg_send(): fi_tsend() failed, rc: -13 (Permission denied)

04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] external ERR  # [6937851.329374] mercury->hg: [error] /builddir/build/BUILD/mercury-2.1.0rc4/src/mercury_core.c:2727

 # hg_core_forward_na(): Could not post send for input buffer (NA_ACCESS)

04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] hg   ERR  src/cart/crt_hg.c:1104 crt_hg_req_send_cb(0x1d0cd40) [opc=0x4070001 (DAOS) rpcid=0x63f8133700000008 rank:tag=1:2] RPC failed; rc: DER_HG(-1020): 'Transport layer mercury error'

04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] object ERR  src/object/cli_shard.c:889 dc_rw_cb() RPC 1 failed, DER_HG(-1020): 'Transport layer mercury error'

 

Am I misusing dfs_lookup(), or calling it incorrectly?

 

--chuck


Re: dfs_lookup behavior for non-existent files?

Chaarawi, Mohamad
 

Hi Chuck,

 

Neither dfs_lookup nor dfs_stat sets st_ino in the stat buffer.

The reason is that files are uniquely identified by the DAOS object ID, which is 128 bits (64 hi, 64 lo).

You can retrieve that using dfs_obj2id():

https://github.com/daos-stack/daos/blob/master/src/include/daos_fs.h#L316
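To illustrate, here is a minimal sketch (my own illustration, not code from this thread) of how the d_lstat() wrapper quoted below could use dfs_obj2id() to fill in st_ino, and how the ENOENT case surfaces (DFS calls return errno values directly):

#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <daos.h>
#include <daos_fs.h>

/* Hypothetical variant of the d_lstat() wrapper: look up the path, then
 * derive a 64-bit inode number from the 128-bit DAOS object ID. The
 * 128->64 bit fold is lossy and purely illustrative. */
static int
d_lstat_with_ino(dfs_t *dfs, const char *path, struct stat *sb)
{
        dfs_obj_t     *obj = NULL;
        daos_obj_id_t  oid;
        int            rc;

        rc = dfs_lookup(dfs, path, O_RDONLY, &obj, NULL, sb);
        if (rc)                 /* errno value, e.g. ENOENT if path is missing */
                return rc;

        rc = dfs_obj2id(obj, &oid);
        if (rc == 0)
                sb->st_ino = (ino_t)(oid.lo ^ oid.hi);

        dfs_release(obj);
        return rc;
}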

 

Now, for the other error: that seems weird. The errors are coming from the network layer. At that point, are there any servers that are down or were killed (specifically the engine with rank 1)? That would explain the errors.

When I try this myself, I get ENOENT for lookup on “//.Trash” as expected.

 

Thanks,

Mohamad

 

From: daos@daos.groups.io <daos@daos.groups.io> on behalf of Tuffli, Chuck <chuck.tuffli@...>
Date: Tuesday, April 5, 2022 at 12:58 AM
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: [daos] dfs_lookup behavior for non-existent files?

I'm porting an existing application to use DFS (DAOS v2.0.2) instead of POSIX and need help understanding the error messages printed to the console.

 

The code is using dfs_lookup() to retrieve the struct stat of a file. Note the implementation cannot use dfs_stat() as it requires valid values for fields such as st_ino that dfs_stat() does not provide. The code in question is:

 

int
d_lstat(const char * restrict path, struct stat * restrict sb)
{
    int rc;
    dfs_obj_t *obj = NULL;

    rc = dfs_lookup(dfs, path, O_RDONLY, &obj, NULL, sb);
    ...

 

If the file path exists (e.g. "/"), this works. But if the path doesn't exist (e.g. "//.Trash"), the call to dfs_lookup() does not return. Instead, the console endlessly prints messages like:

 

04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] external ERR  # [6937851.329315] mercury->msg: [error] /builddir/build/BUILD/mercury-2.1.0rc4/src/na/na_ofi.c:2972

 # na_ofi_msg_send(): fi_tsend() failed, rc: -13 (Permission denied)

04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] external ERR  # [6937851.329374] mercury->hg: [error] /builddir/build/BUILD/mercury-2.1.0rc4/src/mercury_core.c:2727

 # hg_core_forward_na(): Could not post send for input buffer (NA_ACCESS)

04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] hg   ERR  src/cart/crt_hg.c:1104 crt_hg_req_send_cb(0x1d0cd40) [opc=0x4070001 (DAOS) rpcid=0x63f8133700000008 rank:tag=1:2] RPC failed; rc: DER_HG(-1020): 'Transport layer mercury error'

04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] object ERR  src/object/cli_shard.c:889 dc_rw_cb() RPC 1 failed, DER_HG(-1020): 'Transport layer mercury error'

 

Am I misusing dfs_lookup(), or calling it incorrectly?

 

--chuck


Re: What will happen to DAOS if all SCM space is consumed?

Lombardi, Johann
 

Hi there,

 

Applications will get a DER_NOSPACE error. We currently don’t support serialization of metadata to SSDs.
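As a rough sketch (my illustration, not from this thread), a client using the raw DAOS API would see this condition as a -DER_NOSPACE return code from its update calls and could check for it like this:

#include <stdio.h>
#include <daos.h>
#include <daos_errno.h>

/* Report the out-of-SCM condition; metadata/small-I/O updates can start
 * returning -DER_NOSPACE once SCM is exhausted. */
static int
check_update_rc(int rc)
{
        if (rc == -DER_NOSPACE)
                fprintf(stderr, "DAOS pool is out of SCM space (rc=%d)\n", rc);
        return rc;
}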

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "bob@..." <bob@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Friday 1 April 2022 at 09:12
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] What will happen to DAOS if all SCM space is consumed?

 

Hi all 

What will happen to DAOS when the metadata and small I/Os have consumed all the SCM capacity while the SSDs still have enough space to hold data? What is the strategy? Does it move some 4K values or metadata (keys) to the SSDs and then reclaim their space for incoming data, or does it simply stop accepting I/O requests?

                                                                                Regards 



Re: the fault domain setting of the daos container

Lombardi, Johann
 

Hi there,

 

There is a gap in the support of rf_lvl, which is not yet used by the placement algorithm. Please see https://daosio.atlassian.net/browse/DAOS-10215 to track progress on this.
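(Side note, as an illustration only: the container itself can still be created by dropping the rejected rf_lvl property and keeping rf, e.g. "daos cont create pool --label cont12 --type POSIX --properties rf:1". This only avoids the property error; placement still uses the hard-coded rank-level fault domain tracked in DAOS-10215.)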

 

Cheers,

Johann

 

From: <daos@daos.groups.io> on behalf of "dagouxiong2015@..." <dagouxiong2015@...>
Reply-To: "daos@daos.groups.io" <daos@daos.groups.io>
Date: Thursday 24 March 2022 at 05:02
To: "daos@daos.groups.io" <daos@daos.groups.io>
Subject: [daos] the fault domain setting of the daos container

 

Hello everyone:

Recently I have been studying the fault domain setting of the DAOS container. I hope to deploy multiple engines on a single physical node while distributing object data across different physical nodes, so that the failure of a single physical node does not cause data loss.

But when I try to configure this:

[root@server-1 ~]# daos cont create pool --label cont12 --type POSIX --properties rf:1  --properties rf_lvl:0

ERROR: daos: "rf_lvl" is not a settable property (valid: cksum,cksum_size,compression,dedup,dedup_threshold,ec_cell,encryption,label,rf,srv_cksum,status)


Code analysis shows that only one fault-domain layout (by rank) is supported.

/**
 * Level of fault-domain to use for object allocation
 * rank is hardcoded to 1, [2-254] are defined by the admin
 */
enum {
        DAOS_PROP_CO_REDUN_MIN        = 1,
        DAOS_PROP_CO_REDUN_RANK       = 1, /** hard-coded */
        DAOS_PROP_CO_REDUN_MAX        = 254,
};

 

In the current situation, if we want data redundancy across different physical nodes, do you have any good suggestions?

Does DAOS plan to support configurable fault domains in the future?

 

Best regards!



dfs_lookup behavior for non-existent files?

Tuffli, Chuck
 

I'm porting an existing application to use DFS (DAOS v2.0.2) instead of POSIX and need help understanding the error messages printed to the console.

The code is using dfs_lookup() to retrieve the struct stat of a file. Note the implementation cannot use dfs_stat() as it requires valid values for fields such as st_ino that dfs_stat() does not provide. The code in question is:

int
d_lstat(const char * restrict path, struct stat * restrict sb)
{
    int rc;
    dfs_obj_t *obj = NULL;

    rc = dfs_lookup(dfs, path, O_RDONLY, &obj, NULL, sb);
    ...

If the file path exists (e.g. "/"), this works. But if the path doesn't exist (e.g. "//.Trash"), the call to dfs_lookup() does not return. Instead, the console endlessly prints messages like:

04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] external ERR  # [6937851.329315] mercury->msg: [error] /builddir/build/BUILD/mercury-2.1.0rc4/src/na/na_ofi.c:2972
 # na_ofi_msg_send(): fi_tsend() failed, rc: -13 (Permission denied)
04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] external ERR  # [6937851.329374] mercury->hg: [error] /builddir/build/BUILD/mercury-2.1.0rc4/src/mercury_core.c:2727
 # hg_core_forward_na(): Could not post send for input buffer (NA_ACCESS)
04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] hg   ERR  src/cart/crt_hg.c:1104 crt_hg_req_send_cb(0x1d0cd40) [opc=0x4070001 (DAOS) rpcid=0x63f8133700000008 rank:tag=1:2] RPC failed; rc: DER_HG(-1020): 'Transport layer mercury error'
04/04-16:28:50.90 xxxxx DAOS[1178648/1178648/0] object ERR  src/object/cli_shard.c:889 dc_rw_cb() RPC 1 failed, DER_HG(-1020): 'Transport layer mercury error'

Am I misusing dfs_lookup(), or calling it incorrectly?

--chuck


What will happen to DAOS if all SCM space is consumed?

bob@...
 

Hi all 

What will happen to DAOS when the metadata and small I/Os have consumed all the SCM capacity while the SSDs still have enough space to hold data? What is the strategy? Does it move some 4K values or metadata (keys) to the SSDs and then reclaim their space for incoming data, or does it simply stop accepting I/O requests?

                                                                                Regards 


the fault domain setting of the daos container

dagouxiong2015@...
 

Hello everyone:

Recently I have been studying the fault domain setting of the DAOS container. I hope to deploy multiple engines on a single physical node while distributing object data across different physical nodes, so that the failure of a single physical node does not cause data loss.
But when I try to configure this:
[root@server-1 ~]# daos cont create pool --label cont12 --type POSIX --properties rf:1  --properties rf_lvl:0
ERROR: daos: "rf_lvl" is not a settable property (valid: cksum,cksum_size,compression,dedup,dedup_threshold,ec_cell,encryption,label,rf,srv_cksum,status)

Code analysis shows that only one fault-domain layout (by rank) is supported.
/**
 * Level of fault-domain to use for object allocation
 * rank is hardcoded to 1, [2-254] are defined by the admin
 */
enum {
        DAOS_PROP_CO_REDUN_MIN        = 1,
        DAOS_PROP_CO_REDUN_RANK        = 1, /** hard-coded */
        DAOS_PROP_CO_REDUN_MAX        = 254,
};

In the current situation, if we want data redundancy across different physical nodes, do you have any good suggestions?
Does DAOS plan to support configurable fault domains in the future?

Best regards!


Announcement: DAOS 2.0.2 is generally available

Prantis, Kelsey
 

All,

 

We are pleased to announce that the DAOS 2.0.2 release is now generally available. Notable changes in this maintenance release include the following updates on top of DAOS 2.0.1:

 

  • mercury has been updated from 2.1.0~rc4-3 to 2.1.0~rc4-5. This fixes multiple issues with dmg pool destroy, including DAOS-9725 and DAOS-9006.
  • dfuse readahead caching has been disabled when write-through caching is enabled (DAOS-9738).
  • An issue with EC aggregation has been fixed where it was running too frequently and consuming CPU cycles even when EC is not used (DAOS-9926).
  • The Go dependency has been updated to >= 1.17 (DAOS-9908).
  • The Hadoop dependency has been updated to 3.3.2 (DAOS-10068).

 

There are a number of resources available for the release:

As always, feel free to report any issues you find with the release on this mailing list, in our JIRA bug tracking system at https://daosio.atlassian.net/jira, or on our Slack channel at https://daos-stack.slack.com.

 

Regards,

 

Kelsey Prantis

Senior Software Engineering Manager

Super Compute Storage Architecture and Development Division

Intel

 

 


Re: High latency in metada write

shadow_vector@...
 

Hi Liang:
 
  Is there any result from the array write test? Is there something wrong with my test?


Best Regards!


Re: Jenkins test

Murrell, Brian
 

On Wed, 2022-03-02 at 17:29 -0800, dongfeier wrote:
> Scripts not permitted to use staticMethod
> org.codehaus.groovy.runtime.DefaultGroovyMethods getAt
> java.lang.Object java.lang.String.

Ultimately this means that some code in an untrusted shared library is
trying to access a non-whitelisted groovy function.

> Administrators can decide whether to approve or reject this
> signature. ( http://172.20.18.132:8080/scriptApproval )

You *could* do the above with the security implications it involves,
but the correct solution is to use whitelisted methods.

> Error when executing unsuccessful post condition:
> org.jenkinsci.plugins.scriptsecurity.sandbox.RejectedAccessException:
> Scripts not permitted to use staticMethod
> org.codehaus.groovy.runtime.DefaultGroovyMethods getAt
> java.lang.Object java.lang.String

This is the method that is not whitelisted.

>         at org.jenkinsci.plugins.scriptsecurity.sandbox.whitelists.StaticWhitelist.rejectStaticMethod(StaticWhitelist.java:279)
>         at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onGetArray(SandboxInterceptor.java:476)
>         at org.kohsuke.groovy.sandbox.impl.Checker$11.call(Checker.java:484)
>         at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetArray(Checker.java:489)
>         at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.getArray(SandboxInvoker.java:45)
>         at com.cloudbees.groovy.cps.impl.ArrayAccessBlock.rawGet(ArrayAccessBlock.java:21)
>         at notifyBrokenBranch.call(notifyBrokenBranch.groovy:37)

And this is where it's being called from. It's here:

https://github.com/daos-stack/pipeline-lib/blob/03a6dd8f16808094e2ba2971e839707cd690c0a5/vars/notifyBrokenBranch.groovy#L37

It's the use of env[] that is the problem. One solution here is to
move that function to the trusted library at:

https://github.com/daos-stack/trusted-pipeline-lib

But it seems a more correct solution is to replace the env[NAME]
accesses with env."NAME", as this (completely untested) PR does:

https://github.com/daos-stack/pipeline-lib/pull/291

Cheers,
b.


Re: Jenkins test

Pittman, Ashley M
 

 

I suspect that this is because you haven't configured Jenkins to use a build user and it's building your code as root. Our dockerfiles use the uid of the caller to own the files so that Jenkins can copy files in/out of the container, and we didn't think about the case where docker would be run as root. In theory we could probably work around this in the dockerfiles, but I recommend you look at your Jenkins config first.

 

Ashley.

 

From: daos@daos.groups.io <daos@daos.groups.io> on behalf of dongfeier <15735154041@...>
Date: Thursday, 3 March 2022 at 03:16
To: daos@daos.groups.io <daos@daos.groups.io>
Subject: Re: [daos] Jenkins test


Hello,
     

useradd: UID 0 is not unique
The command '/bin/sh -c useradd --no-log-init --uid $UID --user-group --create-home --shell /bin/bash             --home /home/daos daos_server' returned a non-zero code: 4

 

Jenkins not configured to notify users of failed builds.
Scripts not permitted to use staticMethod org.codehaus.groovy.runtime.DefaultGroovyMethods getAt java.lang.Object java.lang.String. Administrators can decide whether to approve or reject this signature.
Error when executing unsuccessful post condition:
org.jenkinsci.plugins.scriptsecurity.sandbox.RejectedAccessException: Scripts not permitted to use staticMethod org.codehaus.groovy.runtime.DefaultGroovyMethods getAt java.lang.Object java.lang.String
  at org.jenkinsci.plugins.scriptsecurity.sandbox.whitelists.StaticWhitelist.rejectStaticMethod(StaticWhitelist.java:279)
  at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onGetArray(SandboxInterceptor.java:476)
  at org.kohsuke.groovy.sandbox.impl.Checker$11.call(Checker.java:484)
  at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetArray(Checker.java:489)
  at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.getArray(SandboxInvoker.java:45)
  at com.cloudbees.groovy.cps.impl.ArrayAccessBlock.rawGet(ArrayAccessBlock.java:21)
  at notifyBrokenBranch.call(notifyBrokenBranch.groovy:37)
  at WorkflowScript.run(WorkflowScript:1049)
  at org.jenkinsci.plugins.pipeline.modeldefinition.ModelInterpreter.delegateAndExecute(ModelInterpreter.groovy:137)
  at org.jenkinsci.plugins.pipeline.modeldefinition.ModelInterpreter.runPostConditions(ModelInterpreter.groovy:761)
  at org.jenkinsci.plugins.pipeline.modeldefinition.ModelInterpreter.catchRequiredContextForNode(ModelInterpreter.groovy:395)
  at org.jenkinsci.plugins.pipeline.modeldefinition.ModelInterpreter.catchRequiredContextForNode(ModelInterpreter.groovy:393)
  at org.jenkinsci.plugins.pipeline.modeldefinition.ModelInterpreter.runPostConditions(ModelInterpreter.groovy:760)
  at com.cloudbees.groovy.cps.CpsDefaultGroovyMethods.each(CpsDefaultGroovyMethods:2030)
  at com.cloudbees.groovy.cps.CpsDefaultGroovyMethods.each(CpsDefaultGroovyMethods:2015)
  at com.cloudbees.groovy.cps.CpsDefaultGroovyMethods.each(CpsDefaultGroovyMethods:2056)
  at org.jenkinsci.plugins.pipeline.modeldefinition.ModelInterpreter.runPostConditions(ModelInterpreter.groovy:750)
  at org.jenkinsci.plugins.pipeline.modeldefinition.ModelInterpreter.runPostConditions(ModelInterpreter.groovy)
  at org.jenkinsci.plugins.pipeline.modeldefinition.ModelInterpreter.executePostBuild(ModelInterpreter.groovy:728)
  at ___cps.transform___(Native Method)
  at com.cloudbees.groovy.cps.impl.PropertyishBlock$ContinuationImpl.get(PropertyishBlock.java:74)
  at com.cloudbees.groovy.cps.LValueBlock$GetAdapter.receive(LValueBlock.java:30)
  at com.cloudbees.groovy.cps.impl.PropertyishBlock$ContinuationImpl.fixName(PropertyishBlock.java:66)
  at sun.reflect.GeneratedMethodAccessor430.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
  at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
  at com.cloudbees.groovy.cps.Next.step(Next.java:83)
  at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
  at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
  at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:129)
  at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:268)
  at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
  at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
  at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:51)
  at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:185)
  at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:402)
  at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$400(CpsThreadGroup.java:96)
  at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:314)
  at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:278)
  at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:67)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
  at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
  at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
I have found the cause of the problem: the UID was duplicated when adding the user. Thank you very much.


Re: Storage Usage

d.korekovcev@...
 

If I manually remove hosts from conf
