DAOS Community Update / Nov'21


Lombardi, Johann
Hi there,

Please find below the DAOS community newsletter for November 2021.

Past Events (October)

 

Upcoming Events

  • SC’21 Tutorial (Nov 15)
    Practical Persistent Memory Programming: PMDK and DAOS
    Adrian Jackson (University of Edinburgh)
    Mohamad Chaarawi (Intel)
    Johann Lombardi (Intel)

    https://sc21.supercomputing.org/presentation/?id=tut134&sess=sess210
  • SC’21 BoF (Nov 16)
    Object-stores for HPC - a Devonian Explosion or an Extinction Event?
    Philippe Deniel, CEA
    John Bent, Seagate
    Tiago Quintino, ECMWF
    Johann Lombardi, Intel

    https://sc21.supercomputing.org/presentation/?id=bof122&sess=sess368
  • DAOS User Group (DUG’21) on Nov 19 from 8:45am to 1:30pm (Central time).
    Agenda and Zoom invite available at http://dug.daos.io
  • Intel SC21 Booth
    Dev-led talks:
    DAOS Unleashes the Power in HPC Applications: QCT (Quanta) DevCloud Experience Sharing
    A Compact, Scalable, Efficient Data Collection Solution for Edge/IoT to Cloud by Zettar
    Fireside chat:
    Cambridge Service for Data Driven Discovery Enables Real-time Hospital Decision Support Systems to Improve Outcomes

Release

  • A new 2.0 test build (v1.3.106-tb) was tagged a few weeks ago and we are now working towards a release candidate.
  • The 2.0 release stream has been branched under release/2.0.
  • Master is now the development branch for the future 2.2 release.
  • Major recent 2.0 changes:
    • Upgrade Libfabric to v1.13.2rc1 to pick up critical rxm/verbs fixes
    • Fix overflow in SPDK when issuing NVMe unmap (aka trim) operations to large SSDs
    • Fix a PMDK issue where a transaction can return the wrong error code if yielding in the TX_STAGE_NONE callbacks
    • Improve interface detection for verbs in the agent
    • Fix ULT leaks causing OOM issues when running for a long time
    • Fix races in the GetAttachInfo operation in the DAOS agent
    • Show VMD backing addresses in storage scan
    • Improve Prometheus exporter performance
    • Add many new tests
    • Prevent truncation from causing incorrect reads in dfuse
    • Add interception for mkstemp in the interception library
    • Several EC fixes
    • Move SWIM ULT to a separate core to avoid interference
    • Allow object discard on multiple objects in VOS
  • Major recent master changes:
    • Add new engine metrics for the VEA module (extent allocator)
    • Migrate several CI tests from the sockets to the tcp provider
    • Initial support for the CXI provider
    • Add build support for AlmaLinux and Rocky Linux
  • What is coming:
    • Addressing the last few 2.0 blockers

R&D

  • Major features under development:
    • Checksum scrubbing
    • LDMS plugin to export DAOS metrics (targeted for 2.2)
    • API to collect libdaos metrics to be integrated with Darshan (targeted for 2.2)
    • Multi-user dfuse (targeted for 2.2)
    • More aggressive caching in dfuse for AI apps (targeted for 2.2)
    • Design for catastrophic recovery / fsck
  • Pathfinding:
    • MariaDB DAOS engine with predicate pushdown to the DAOS storage nodes
      • Prototype DAOS MariaDB engine available here: https://github.com/daos-stack/mariadb
      • PR for pipeline API (#6238)
      • Work in progress to support pipeline API in the engine
    • Leveraging the Intel Data Streaming Accelerator (DSA) to accelerate DAOS
      • Prototype leveraging DSA for VOS aggregation
