## Current Projects:

### Fluid-Solid Interaction

Studying fluid-solid interaction (FSI) problems involves several aspects of scientific computing. Perhaps the most remarkable aspect of our approach is that we use a Lagrangian-Lagrangian framework to tackle FSI problems. In this area, our technical effort is focused along three research thrusts:

**1- Numerical method aspects:** Smoothed Particle Hydrodynamics (SPH) is a meshless, Lagrangian numerical method in computational fluid dynamics (CFD) that closely matches the Lagrangian framework generally used in classical solid mechanics and multi-body dynamics. We are interested in improving the accuracy, robustness, and efficiency of SPH. On the accuracy and robustness fronts, we are investigating (i) higher-order accurate discretization methods, (ii) higher-order accurate boundary condition enforcement, (iii) projection-based methods and higher-order fractional time-splitting approaches, (iv) variable resolution approaches, and (v) incorporating fast linear-system solvers. These research areas pursue a unified solution from different directions. Their common denominator is achieving a better solver in terms of spatial and temporal accuracy, both inside the domain and at the fluid-solid boundary.

**(Milad Rakhsha, Zubin Lal, Lijing Yang)**
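
As a rough illustration of the SPH idea described above, the following minimal sketch evaluates the standard SPH density summation in 1D with the widely used cubic spline kernel. It is not taken from the group's solver; the function names, the particle spacing, and the reference density of 1000 are assumptions chosen for the example.

```python
def cubic_spline_kernel_1d(r, h):
    """Standard 1D cubic spline kernel with smoothing length h (support radius 2h)."""
    q = abs(r) / h
    sigma = 2.0 / (3.0 * h)  # 1D normalization so the kernel integrates to 1
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q**2 + 0.75 * q**3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q) ** 3
    return 0.0


def sph_density(positions, masses, h):
    """Density summation: rho_i = sum_j m_j * W(x_i - x_j, h)."""
    return [
        sum(m_j * cubic_spline_kernel_1d(x_i - x_j, h)
            for x_j, m_j in zip(positions, masses))
        for x_i in positions
    ]


# Hypothetical setup: 11 equally spaced particles carrying a reference density of 1000.
dx = 0.1
positions = [i * dx for i in range(11)]
masses = [1000.0 * dx] * len(positions)
rho = sph_density(positions, masses, h=dx)
```

In this setup the interior particles recover the reference density, while the two end particles see a truncated kernel support and report a lower value. That deficiency near boundaries is one reason boundary condition enforcement receives separate attention in SPH.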

**2- HPC aspects:** We are investigating ways to increase the speed of the CFD solver by making use of advanced computing techniques such as MPI, multi/many-core parallelism, and GPU computing. Ongoing research focuses on mapping the FSI problem onto multiple GPUs to be able to scale up the problem size.

**(Milad Rakhsha, Justin Williams)**

**3- Applications:** We are currently investigating applications in biomechanics, simulating the micro-structure of biological tissues such as articular cartilage. The challenges we are facing include (i) coupling the fluid solver to different types of flexible bodies, such as 1D beam elements, 2D shell elements, and 3D brick elements, as well as to rigid bodies with arbitrary shapes; and (ii) implicit coupling between the fluid and solid as opposed to explicit force-displacement coupling.

**(Milad Rakhsha)**

Contributors: Milad Rakhsha, Zubin Lal, Lijing Yang, Justin Williams, Radu Serban, Dan Negrut

### Chrono Benchmarking for Next Generation NATO Reference Mobility Model (NG-NRMM)

As part of the development of the Next Generation NATO Reference Mobility Model (NG-NRMM), UW-Madison participated in an industry-wide multibody-dynamics simulation benchmarking activity that occurred in 2016 and 2017. Out of all the participants, Chrono was the only open-source software solution. For Phase I of the benchmarking activity, the lab built a model of an M113 replica utilizing the tracked vehicle capabilities of Chrono::Vehicle. This model was subsequently exercised in a variety of events, including a maximum swept radius during a turn-in-place maneuver, double lane change maneuvers, obstacle avoidance and crossing maneuvers, and several others. For Phase II of the benchmarking activity, the lab built a model of a 4-wheeled demonstration vehicle and exercised it through a set of maneuvers similar to, but slightly different from, those of Phase I.

Contributors: Radu Serban, Asher Elmquist, Rainer Gericke, Michael Taylor, Daniel Melanz, Dan Negrut

### Accelerating Nonlinear Finite Element Formulations for Multi-body Dynamics

Although formulations have been developed for integrating nonlinear flexible bodies within multibody dynamics frameworks, these formulations run slowly enough that their usefulness is questionable. In this project, we seek to use software optimization techniques and new numerical techniques with the goal of producing speed-ups of one to two orders of magnitude. A select set of benchmarking tests will be timed through the course of the project to identify the techniques leading to the largest speed gains. Our software and modeling advances will be continuously validated to ensure that solutions maintain their accuracy through the performance improvement process.

Contributors: Mike Taylor, Radu Serban, Dan Negrut

### Representing Fluid Dynamics as a Rigid-Body Dynamics Problem with Friction and Contact

The purpose of this project is to understand whether fluid motion can be accurately represented as a very large collection of rigid spheres. The motion of fluids is traditionally modeled using the Navier-Stokes equations, whereas rigid body motion is governed by the Newton-Euler equations. This work attempts to achieve a high-resolution discrete representation of a continuum problem using a large number of bodies interacting via frictional contact. If it proves possible, using rigid-body dynamics to simulate fluid dynamics could open doors to new modeling approaches in many fields, including turbulence and fluid-solid interaction. Moreover, this technique would allow for faster simulations since its solution algorithm maps very well to modern Graphics Processing Unit (GPU) computing architectures.

Conlain Kelly

### Synchrono: A Multi-Agent Simulation Framework for Robotics and Autonomous Vehicle Applications

Synchrono is a framework in which dynamic multi-agent simulations can be conducted to understand agent interplay and develop control algorithms in a safe and flexible environment. To create a virtual proving ground for autonomous vehicles and robots, Synchrono enables multiple agents/vehicles to operate within the same virtual environment and to interact with each other through sensing and communication protocols. The vehicle and agent physics are simulated using Chrono and the additional Chrono modules. Continued development focuses on agent communication, sensor feedback, and enrichment of the virtual environment. Agent communication centers both on allowing agents to participate from geographically distributed locations and on allowing agents to communicate using protocols such as DSRC or 5G. The goal of sensor modeling and simulation is to provide the control algorithms with physically realistic data obtained within the virtual world. To allow for this and for more realistic physics, work is being done to use physical data to generate the virtual environment.

Asher Elmquist, Dylan Hatch, Radu Serban, Prof. Dan Negrut

### Distributed-Memory Granular Simulation

Simulating granular material at nearly any practical scale involves computing the dynamics of billions of bodies. In order to generate high-resolution predictions for the behavior of granular material, a simulation must track each of these bodies and all of their associated properties. This amounts to a very large workload for a single computer running the simulation, often making full-resolution simulations impossible because of memory or time constraints. This project focuses on expanding the capabilities of granular simulation to include simulations with numbers of bodies on the order of billions. To do so, the simulation is broken into a number of smaller, mostly disjoint sub-problems, each of which can be solved on a separate node of a computing cluster with minimal communication between nodes. The resulting software will be an open-source distributed-memory module of ProjectChrono, Chrono::Distributed, which will utilize the Message Passing Interface (MPI) standard to coordinate many instances of the multi-threading module, Chrono::Parallel, running on separate nodes in a cluster.

Nic Olsen
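
The decomposition into mostly disjoint sub-problems can be pictured with a small sketch. The code below is only an illustration under assumed choices (slabs along the x axis, a fixed ghost-layer width, plain Python lists in place of MPI ranks); it does not reproduce the scheme used in Chrono::Distributed, and the function name `decompose_along_x` is hypothetical.

```python
def decompose_along_x(positions, x_min, x_max, num_ranks, ghost_width):
    """Assign each body to one owning slab and record which neighbor slabs need a ghost copy."""
    slab = (x_max - x_min) / num_ranks
    owned = [[] for _ in range(num_ranks)]
    ghosts = [[] for _ in range(num_ranks)]
    for body_id, (x, _y, _z) in enumerate(positions):
        # Clamp so bodies exactly on the upper domain edge still get a valid owner.
        owner = min(max(int((x - x_min) / slab), 0), num_ranks - 1)
        owned[owner].append(body_id)
        lo = x_min + owner * slab
        hi = lo + slab
        # Bodies close to a slab face may touch bodies owned by the neighbor,
        # so the neighbor keeps a read-only ghost copy of them.
        if owner > 0 and x - lo < ghost_width:
            ghosts[owner - 1].append(body_id)
        if owner < num_ranks - 1 and hi - x < ghost_width:
            ghosts[owner + 1].append(body_id)
    return owned, ghosts


# Hypothetical example: five bodies, four slabs of width 1.0, ghost layer of 0.1.
bodies = [(0.5, 0, 0), (0.95, 0, 0), (1.05, 0, 0), (2.5, 0, 0), (3.9, 0, 0)]
owned, ghosts = decompose_along_x(bodies, 0.0, 4.0, num_ranks=4, ghost_width=0.1)
```

Only the bodies in the ghost lists would need to be exchanged between neighboring nodes at each step, which is what keeps inter-node communication small relative to the work done inside each slab.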

### SPIKE GPU - An Implementation of a Recursive Divide-and-Conquer Parallel Strategy for Solving Large Systems of Linear Equations

This project proposes to investigate, produce, and maintain a methodology and its software implementation that leverage emerging heterogeneous hardware architectures to solve linear systems with billions of unknowns in a robust, scalable, and efficient fashion. The two classes of problems targeted under this project are banded dense and sparse general linear systems. Preliminary results suggest that the adopted methodology displays good strong scaling and that its early implementation, called SPIKE, is one order of magnitude faster than competing software solutions.

Ang Li, Radu Serban, Dan Negrut

### Coupled Fluid-Flexible Body Investigation Using Chrono::Fluid

The interaction of fluids and flexible bodies was studied via a Lagrangian-Lagrangian framework, relying on Smoothed Particle Hydrodynamics, general 3D rigid body dynamics, and the Absolute Nodal Coordinate Formulation (ANCF). The dynamics of the two phases, fluid and solid, are coupled with the help of Lagrangian markers, referred to as Boundary Condition Enforcing (BCE) markers, which are used to impose no-slip and impenetrability conditions. Such BCE markers are associated both with the solid suspended particles and with any confining boundary walls, and are distributed in a narrow layer on and below the surface of solid objects. The ensuing fluid-solid interaction forces are mapped into generalized forces on the rigid and flexible bodies and subsequently used to update the dynamics of the solid objects according to rigid body motion or the ANCF method. The robustness and performance of the simulation algorithm are demonstrated through several numerical simulation studies.

##### Videos:

Immersed Flexible Beams in Impulsively Started Channel Flow

SPH-ANCF Model of Polymer Particles in Channel Flow

Arman Pazouki, Radu Serban, Dan Negrut

### Characterization of Xeon Phi with Linear Algebra Workloads

This independent study analyzes how well suited Xeon Phi is for some frequently used linear algebra routines, such as factorization and solvers. We are working with Intel MKL 11.1 on Xeon Phi based on the KNC (MIC) architecture. The workloads under study include factorization and solving of dense and banded systems. Specifically, we are investigating the potential vectorization opportunities for such routines. The goal is to make use of all such opportunities that can help design a hybrid banded SPIKE-based solver.

Omkar Deshmukh, Dan Negrut

### Performance Analysis of CULA on different NVIDIA GPU Architectures

CULA is a next-generation linear algebra package that uses the GPU as a co-processor to achieve speedups over existing linear algebra packages. CULA supports matrix inversion, which helps in factorizing matrices and solving linear systems. The performance and actual speedups of CULA depend heavily on the algorithm and the size of the data set. Performance also varies with the GPU memory available for the computation, which differs across NVIDIA GPU cards. This dependence can potentially be explored by using the device interface model of CULA, so that performance in terms of GFLOPS can be analyzed on the Fermi, Tesla, and Kepler architectures. The performance analysis will involve running different applications on CULA dense R17. This study is important because it will show the advantages of using a particular architecture to get optimized performance for the SPIKE GPU solver.

Prateek Gupta, Dan Negrut

### Performance Comparison Study between Nvidia Fermi and Kepler Architecture

A comparative study between the Nvidia Fermi and Kepler architectures is being undertaken. The key aspects being targeted are performance scaling for computational kernels such as tiled matrix multiplication, memory transfer behavior, gains from using streams, and the performance difference observed when using the Thrust library. Variations in execution configuration, working data set, and occupancy on the individual architectures will be exercised.

Contributors: Arindam Sinha, Dan Negrut
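
For readers unfamiliar with tiling, the sketch below shows the loop structure of a tiled (blocked) matrix multiplication in plain Python. It is an assumption-laden illustration rather than the CUDA kernel used in the study: on a GPU, each tile pair would typically be staged in shared memory by a thread block, whereas here the tiles are just index ranges and `tile` is a hypothetical parameter.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply matrices a (n x k) and b (k x m) one tile at a time."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):          # tile row of the output
        for j0 in range(0, m, tile):      # tile column of the output
            for k0 in range(0, k, tile):  # march along the shared dimension
                # A GPU kernel would load the a-tile and b-tile for this step
                # into fast on-chip memory before the inner loops run.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c


# Small example with dimensions that are not multiples of the tile size.
a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
c = tiled_matmul(a, b, tile=2)
```

The tile size controls how much data is reused per load, which is why it interacts with the execution configuration and occupancy on each architecture.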

### Selective Laser Sintering Simulation Using Chrono::Engine

This project presents an effort to use physics-based simulation techniques to model the Selective Laser Sintering (SLS) layering process. SLS is an additive manufacturing process that melts thin layers of extremely fine powder; we use powder with an average diameter of 58 microns. In the numerical model, each powder particle is a discrete object, and 632,000 objects are used for the SLS layering simulation. We first performed an experiment to measure the angle of repose for the polyamide 12 (PA 650) powder used in the SLS process. This measurement was used to determine the correct friction parameters and calibrate the numerical model. Once the model was calibrated, initial simulations of the SLS layering process were performed to measure the changes in the surface profile of the powder. Future work will study the effect that different powders and roller speeds have on the surface roughness of a newly deposited powder layer, and will determine the changes to density and porosity in the final part.

##### Videos:

SLS Layering Video

SLS Angle Of Repose Video

Contributors: Hammad Mazhar, Endrina Forti, Jonas Bollmann

Prof. Tim Osswald, Prof. Dan Negrut

### PD-IP - A Numerical Method for Computing Frictional Contact Forces of Large Multibody Systems

A Primal-Dual Interior Point (PD-IP) method is investigated for solving the Cone Complementarity Problem (CCP) associated with friction and contact in many-body dynamics formulated in the framework of differential variational inequalities (DVI). As a second-order method, it exhibits a significantly faster convergence rate than traditional first-order methods, such as Gauss-Seidel, Jacobi, or Accelerated Projected Gradient Descent (APGD), and calls for a smaller number of iterations. Essentially, this method finds the optimal solution by solving a sequence of large linear systems of equations. The ongoing work focuses on solving this sequence of systems efficiently, using either an iterative solver or a direct one.

Contributors: Luning Fang, Dan Negrut
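
To make the cone constraint in the CCP more concrete, here is a minimal sketch of the Euclidean projection of a single contact impulse onto the Coulomb friction cone, the building block that projection-based first-order methods apply at every iteration. It does not implement PD-IP itself. The layout of the impulse as (normal, tangential u, tangential w) and the function name are assumptions made for the example.

```python
import math


def project_onto_friction_cone(gamma_n, gamma_u, gamma_w, mu):
    """Project (gamma_n, gamma_u, gamma_w) onto {sqrt(gu^2 + gw^2) <= mu * gn}."""
    t = math.hypot(gamma_u, gamma_w)  # magnitude of the tangential part
    if t <= mu * gamma_n:
        # Already inside the cone: sticking contact, nothing to change.
        return gamma_n, gamma_u, gamma_w
    if mu * t <= -gamma_n:
        # Inside the polar cone: the closest point of the cone is its apex.
        return 0.0, 0.0, 0.0
    # Otherwise project onto the cone surface (sliding contact).
    new_n = (gamma_n + mu * t) / (1.0 + mu * mu)
    scale = mu * new_n / t
    return new_n, gamma_u * scale, gamma_w * scale


# Hypothetical impulse that violates the cone for mu = 0.5.
projected = project_onto_friction_cone(1.0, 3.0, 4.0, mu=0.5)
```

After projection the tangential magnitude equals `mu` times the normal component, so the result lies on the cone surface. An interior point method avoids repeated projections of this kind and instead follows a path through the interior of the cone, at the cost of a linear solve per iteration.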

### Robot Walking on Granular Terrain

The simulation consists of a six-legged robot that walks over granular terrain. This project begins to model the experiment presented in *A Terradynamics of Legged Locomotion on Granular Media* by Li, Zhang, and Goldman. Currently, the terrain is composed of 10,000 spheres.

Francisco Mercado

### Cross-sectional Pattern in Mixing of Granular Material

This simulation is modeled after *Spontaneous chaotic granular mixing* by Shinbrot, Alexander, and Muzzio. A mixing barrel is filled with sand-sized particles, with differently colored particles upstream and downstream. In the article, the fractal pattern that emerges in the cross-section is observed at each quarter turn of the barrel; in our simulation, it can be observed throughout.

Francisco Mercado

### A Multibody Dynamics-Enabled Mobility Analysis Tool for Military Applications

This project demonstrates a modeling, simulation, and visualization framework aimed at enabling physics-based analysis of ground vehicle mobility. This framework, called Chrono, has been built to leverage parallel computing on both distributed and shared memory architectures. Chrono is both modular and extensible. Modularity stems from the design decision to build vertical applications whose goal is to reduce the end-to-end time from vision-to-model-to-solution-to-visualization for a targeted application field. Extensibility is a consequence of the design of the foundation modules, which can be enhanced with new features that benefit all the vertical applications. Two factors motivated the development of Chrono. First, there is a manifest need for modeling approaches and simulation tools to support mobility analysis on deformable terrain. Second, the hardware available today has improved to a point where the sheer compute power, the memory size, and the available software stack (productivity tools and programming languages) support computing on a scale that allows integrating highly accurate vehicle dynamics and physics-based terramechanics models. Although commercial software is available for simulating vehicle and tire models that operate on paved roads, deformable terrain models that complement the fidelity of present-day vehicle and tire models have been lacking due to the complexity of soil behavior. This project demonstrates Chrono's ability to handle these difficult mobility situations through several simulations, including: (i) urban operations, (ii) muddy terrain operations, (iii) gravel slope operations, and (iv) river fording.

Daniel Melanz, Hammad Mazhar, Dan Negrut

### Chrono::Render - A Purpose Rendering Capability of Large-Scale Simulation Data

As simulations grow in complexity, the data extracted from the model grows in size. For engineers and scientists, it is difficult and tedious to gain meaningful insights from large data sets; hence, visualization becomes critical to computer simulation since it provides a more natural means of extracting the salient information from abstruse data. Additionally, visualization makes it easier to share and communicate the content of a simulation, leading to wider interest in and understanding of its results. Therefore, we have been developing a rendering pipeline called Chrono::Render, which provides a simple means to efficiently create high-quality renderings of arbitrary data. The pipeline uses the open-source Blender modeling software as the front end via a plugin. The data is then passed to the Simulation Based Engineering Lab's Euler server via a web interface to render with Pixar's PhotoRealistic RenderMan (PRMan) or an open-source alternative such as Aqsis or Pixie.

Contributors: Daniel Kaczmarek, Aaron Bartholomew

### Terramechanics Methods for Real-time Off-Road Vehicle Mobility Simulation on Deformable Terrain

By extending semi-analytical terramechanics methods to general three-dimensional tire and terrain geometries and combining them with a deformable compaction-based terrain model, general-purpose tire/terrain mobility scenarios can be simulated. A vertical application was then created with this framework that combines a multibody vehicle in CHRONO::Rigid with the physics-based, 3-D deformable terrain database of CHRONO::Terrain. Using representative suspension hardpoints, spring/damper rates, and accurate mass/inertia information, a representative HMMWV vehicle model was developed. Contact patch force models were developed by extending semi-analytical terramechanics approaches to the general 3-D case. Leveraging High Performance Computing in the form of parallel CPUs and GPUs allows real-time vehicle mobility to be realized, which in turn enables operator-in-the-loop simulations.

Contributors: Justin Madsen, Andrew Seidl, Dan Negrut - UW Madison

Prof. Paul Ayers, University of Tennessee-Knoxville

### Compaction-Based Terrain Model for Soft Soil Off-Road Vehicle Mobility Simulations

In an effort to support general 3-D vehicle mobility on non-flat terrain, CHRONO::Terrain is a deformable terrain database system that allows the terrain surface to be described at both macro- and micro-scale resolutions. The approach is inspired by previous work that used a combination of global low-resolution surface elements and localized high-frequency B-Splines to add "bumpiness". This makes it possible to capture slopes, hills, and walls, as well as to give the driver the appearance of bumpy, non-flat off-road terrain. A soil-compression model tracks the 3-D stress/strain due to vehicle loads, and the terrain surface deforms according to a visco-elastic-plastic approach that considers the effects of generalized 3-D tire and terrain geometries.

Contributors: Justin Madsen, Andrew Seidl, Dan Negrut - UW Madison

Prof. Paul Ayers, George Bozdech, University of Tennessee-Knoxville

Jeff Freeman, Ford Cook-MechSim Inc.

### Implementation of an Index-3 Differential-Algebraic Equation Solver on Parallel Architecture

The Absolute Nodal Coordinate Formulation (ANCF) has been widely used to carry out the dynamics analysis of flexible bodies that undergo large rotation and large deformation. This formulation is consistent with the nonlinear theory of continuum mechanics and is computationally more efficient than other nonlinear finite element formulations. Kinematic constraints that represent mechanical joints and specified motion trajectories can be introduced to build complex flexible mechanisms. As the complexity of a mechanism increases, the system of differential-algebraic equations becomes very large and results in a computational bottleneck. This project helps alleviate this bottleneck using three tools: (1) an implicit time-stepping algorithm, (2) fine-grained parallel processing on the Graphics Processing Unit (GPU), and (3) enabling parallelism through a novel Constraint-Based Mesh (CBM) approach. The combination of these tools results in a fast solution process that scales linearly for large numbers of elements, allowing meaningful engineering problems to be solved.

Daniel Melanz, Radu Serban, Ang Li, Dan Negrut

## Past Projects:

### Power Performance Scaling Analysis of Computational Kernels using CUDA

A power-performance scaling analysis of matrix multiplication, matrix transpose, and fast Fourier transform CUDA kernels using different optimization techniques was undertaken to correlate the overall power usage. The effect of varying execution configurations and working set sizes on an NVIDIA K20X device was captured using the NVML API provided by NVIDIA. In the course of the project, a kernel-independent, pluggable code structure was designed to compute the power concurrently with kernel execution using OpenMP. Kernel optimizations observed for power behaviour included tiling using shared memory, memory bank conflict-free access, pinned memory on the host, variation in granularity, and special cases like FFT using a reduction operation. The results exposed a favourable working set size for each kernel, one that completes the computation in the most power-efficient way amongst the implemented kernel variations and optimizations.

Contributors: Arindam Sinha, Prateek Gupta
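
One way to turn concurrently sampled power readings into a per-kernel figure is to integrate them over the kernel's runtime. The sketch below shows that post-processing step only, with made-up sample values; the sampling loop, the function names, and the choice of trapezoidal integration are assumptions for the example and are not described in the project summary.

```python
def energy_joules(samples):
    """Trapezoidal integral of (time_in_seconds, power_in_watts) samples."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


def average_power_watts(samples):
    """Energy divided by the sampled duration."""
    duration = samples[-1][0] - samples[0][0]
    return energy_joules(samples) / duration


# Hypothetical readings taken while a kernel runs for one second.
# NVML reports power in milliwatts, so real readings would be divided by 1000 first.
samples = [(0.0, 100.0), (0.5, 120.0), (1.0, 110.0)]
energy = energy_joules(samples)
mean_power = average_power_watts(samples)
```

Comparing energy rather than peak power across working set sizes is what allows a "most power-efficient" configuration to be identified, since a faster kernel variant may draw more power yet consume less energy overall.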

### Metronome Synchronization

Metronomes tuned to the same frequency but initialized out of sync can self-synchronize if they are placed on a common base that is free to translate. Video recordings of this phenomenon can be found in many places on the internet. This interactive simulation allows the user to start each metronome individually and then release the base to allow synchronization of the coupled oscillators. The level of synchronization varies with the amount of damping in the metronomes. With a small amount of damping, the possibility of symmetric synchronization is high; this is when n metronomes have uniformly distributed phase shifts of lambda/n. As damping increases, the likelihood of complete synchronization (all metronomes ticking in unison) increases.

Francisco Mercado

### Simulation and Validation of Particle Suspension Using Chrono::Fluid

We employ a Lagrangian-Lagrangian (LL) numerical formalism to study two- and three-dimensional (2D, 3D) pipe flow of dilute suspensions of macroscopic neutrally buoyant rigid bodies at flow regimes with Reynolds numbers (Re) between 0.1 and 1400. A validation study of particle migration over a wide spectrum of Re and average volumetric concentrations demonstrates the good predictive attributes of the LL approach adopted herein. Using a scalable parallel implementation of the approach, 3D direct numerical simulation is used to show that (1) rigid body rotation affects the behavior of a particle-laden flow; (2) an increase in neutrally buoyant particle size decreases radial migration; (3) a decrease in inter-particle distance slows down the migration and shifts the stable position further away from the channel axis; (4) rigid body shape influences the stable radial distribution of particles; (5) particle migration is influenced, both quantitatively and qualitatively, by the Reynolds number; and (6) the stable radial particle concentration distribution is affected by the initial concentration. The parallel LL simulation framework developed herein does not impose restrictions on the shape or size of the rigid bodies and was used to simulate 3D flows of dense, colloidal suspensions of up to 30,000 neutrally buoyant ellipsoids.

##### Videos:

Rigid Body Suspension in Channel Flow

Contributors: Arman Pazouki, Dan Negrut

### A Parallel GPU Implementation of the Absolute Nodal Coordinate Formulation With a Frictional/Contact Model for the Simulation of Large Flexible Body Systems

This contribution discusses how a flexible body formalism, specifically the Absolute Nodal Coordinate Formulation (ANCF), is combined with a frictional/contact model using a continuous contact force model to address many-body dynamics problems, i.e., problems with hundreds of thousands of rigid and deformable bodies. Since the computational effort associated with these problems is significant, the analytical framework is implemented to leverage the computational power available on today's commodity Graphics Processing Unit (GPU) cards. The code developed is validated against ANSYS and FEAP results. The resulting simulation capability is demonstrated in conjunction with hair simulation.

Contributors: Naresh Khude, Daniel Melanz, Dan Negrut