Thursday, April 10, 2025

Accelerating Drug Discovery in AutoDock-GPU with Tensor Cores

🧪 What is AutoDock-GPU?

AutoDock-GPU is a GPU-accelerated version of AutoDock, one of the most widely used molecular docking programs. Molecular docking is a computational method used to predict how a small molecule (like a drug candidate) binds to a target protein.

AutoDock-GPU speeds up the process by parallelizing computations, allowing thousands of ligand conformations to be tested rapidly. It's vital for virtual screening, where millions of compounds may be docked in silico to find the most promising drug leads.


⚙️ What Was the Bottleneck?

One of the core operations in AutoDock-GPU is computing the scoring function, which estimates how well a ligand binds to a receptor. This involves many mathematical reductions (summations across arrays/vectors of energy terms).

  • In the original implementation, these reduction operations ran on the GPU's general-purpose CUDA cores.

  • These were not fully optimized for newer GPU architectures, particularly NVIDIA’s Tensor Cores, which are capable of performing fused matrix-multiply-add (MMA) operations with extreme speed.

So while AutoDock-GPU was fast, its scoring function reductions were a weak link, especially given the rise of more powerful GPUs with tensor computation capabilities.


🚀 What Did the Authors Do?

The researchers, Gabin Schieffer and Ivy Peng, introduced a new way to perform sum reduction on 4-element float vectors by translating it into a matrix multiplication task that Tensor Cores can execute extremely quickly.

Key Innovations:

  • Reformulated the reduction as a form of matrix operation compatible with NVIDIA’s Tensor Core acceleration hardware.

  • Integrated this optimized reduction back into the AutoDock-GPU codebase.

This is clever because Tensor Cores are typically used for deep learning operations (e.g., matrix-heavy tasks in neural networks). Using them to accelerate classical computational chemistry workflows is innovative and non-trivial.
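The core idea can be sketched in a few lines of plain Python (a schematic of the math, not the authors' CUDA implementation): summing a 4-element vector v is just the dot product of v with a vector of ones, so a batch of such reductions becomes one fused multiply-add, D = A·B + C, which is exactly the matrix shape Tensor Cores execute natively.

```python
# Schematic: cast 4-element sum reductions as a matrix multiply-add,
# the native D = A @ B + C shape of Tensor Cores. Pure-Python stand-in;
# the real kernel uses CUDA matrix fragments.

def reduce_as_mma(batch, accum=None):
    """Sum each 4-element row of `batch` via a dot product with ones."""
    ones = [1.0, 1.0, 1.0, 1.0]           # the B operand: a column of ones
    if accum is None:
        accum = [0.0] * len(batch)        # the C operand: running totals
    # D = A @ B + C, one fused step per row of energy terms
    return [sum(a * b for a, b in zip(row, ones)) + c
            for row, c in zip(batch, accum)]

energies = [[1.0, 2.0, 3.0, 4.0],
            [0.5, 0.5, 0.5, 0.5]]
print(reduce_as_mma(energies))  # [10.0, 2.0]
```

Because the multiply-by-ones and the accumulation happen in a single hardware instruction, the reduction gets the full throughput of the Tensor Core units instead of a chain of scalar additions.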


📈 What Were the Results?

The researchers tested the modified AutoDock-GPU with this new reduction method on a range of protein–ligand complexes across three GPU models:

  • Performance of the reduction operation improved by a factor of 4× to 7×.

  • Overall docking time improved by 27% on average, which is substantial given that docking is a core loop in virtual screening.

This optimization makes the whole drug discovery pipeline significantly faster, especially when screening thousands to millions of compounds.
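A quick Amdahl's-law sanity check shows how a 4–7× kernel speedup translates into roughly 27% less total docking time: the gain depends on what share of the runtime the reduction occupies. The fractions below are illustrative assumptions, not figures from the paper.

```python
def overall_time_reduction(fraction, kernel_speedup):
    """Amdahl's law: `fraction` of the runtime is sped up by `kernel_speedup`;
    return the relative saving on total runtime."""
    new_time = (1.0 - fraction) + fraction / kernel_speedup
    return 1.0 - new_time

# If reductions took ~34% of docking time and got a 5x speedup
# (assumed numbers for illustration), the total saving is ~27%:
print(round(overall_time_reduction(0.34, 5.0), 3))  # 0.272
```

This also explains why the whole-application gain (27%) is smaller than the kernel gain (4–7×): the rest of the docking loop is untouched.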


🧠 Why Does This Matter?

  1. Faster Drug Discovery: Time is critical in drug development (think of pandemic response). A 27% speed-up can shave weeks off a months-long screening campaign.

  2. Efficient GPU Utilization: Maximizing the use of GPU capabilities (like Tensor Cores) means you get more performance without additional hardware investment.

  3. Cross-disciplinary Innovation: This work is a beautiful example of cross-pollination between AI hardware and computational chemistry, pushing the limits of both.


🧾 Summary

Feature | Description
Problem | AutoDock-GPU's scoring-function reduction was not optimized for modern GPU hardware
Solution | Reformulate 4-element vector reductions as Tensor Core-friendly matrix operations
Technology | NVIDIA Tensor Cores (originally designed for AI) repurposed to accelerate docking
Results | 4–7× speedup on the reduction; 27% overall docking-time improvement
Impact | Faster, more efficient virtual screening in drug discovery workflows



Perugia, April 10th 2025

Wednesday, April 9, 2025

AMD Ryzen vs. Intel Core: A 2025 Comparison

The battle between AMD and Intel continues to shape the landscape of consumer and professional computing. With both companies releasing competitive processors in recent years, the choice between AMD Ryzen and Intel Core CPUs is more nuanced than ever. Let’s break down the key differences and performance factors as of 2025.


1. Architecture and Manufacturing Process

✅ AMD Ryzen (Zen 4 & Zen 5)

  • Zen 4 and Zen 5 architectures use TSMC’s advanced 5nm and 4nm nodes.

  • AMD continues to lead in multi-core efficiency and power consumption.

  • The chiplet design allows AMD to scale performance well across product lines (Ryzen 5 to Ryzen 9 and Threadripper).

✅ Intel Core (13th/14th Gen Raptor Lake and Core Ultra Meteor Lake)

  • Intel's 13th Gen (Raptor Lake) and 14th Gen (Raptor Lake Refresh) desktop chips use a hybrid architecture with Performance (P) and Efficiency (E) cores; the mobile-first Meteor Lake parts (branded Core Ultra) extend the same design.

  • Intel is transitioning to Intel 4 and Intel 3 nodes (7nm-class), improving efficiency and integrated GPU power.

  • Integrated Foveros 3D stacking in Meteor Lake improves on-chip communication and modularity.

🆚 Verdict: AMD leads in node maturity and thermal efficiency, while Intel pushes boundaries with hybrid and 3D chip designs.


2. Performance Benchmarks

🧪 Gaming

  • Intel Core i9-14900K remains the king of high-FPS gaming, especially in titles optimized for high clock speeds and fewer threads.

  • Ryzen 7 7800X3D is the gaming darling for eSports and AAA titles thanks to its massive L3 cache via 3D V-Cache.

🧪 Productivity & Multithreading

  • AMD's Ryzen 9 7950X and Threadripper CPUs dominate in content creation, video rendering, and multithreaded tasks.

  • Intel's chips hold their ground with higher clock speeds, making them great for single-threaded workloads and certain DAW/audio tasks.

🆚 Verdict: AMD wins in productivity-heavy and multithreaded environments, while Intel still shines in raw gaming and single-core scenarios.


3. Power Efficiency and Thermals

  • AMD Ryzen 7000 and 8000 series CPUs show excellent performance-per-watt, often requiring less cooling and drawing less power under load.

  • Intel’s 13th/14th Gen CPUs are more power-hungry, especially under full load, which can lead to higher thermal output and the need for beefier cooling solutions.

🆚 Verdict: AMD offers better efficiency and cooler operation, making them ideal for compact or silent builds.


4. Platform and Future-Proofing

AMD (AM5 Platform)

  • The AM5 socket supports DDR5 and PCIe 5.0, and AMD has committed to supporting AM5 until at least 2026.

  • Great for future upgrades without replacing your motherboard.

Intel (LGA 1700 & 1851)

  • Intel's LGA 1700 ends with 14th Gen; Arrow Lake (Core Ultra 200S) moved to LGA 1851, meaning a platform switch is required for the latest chips.

  • Intel is faster with new features, but less stable in long-term socket compatibility.

🆚 Verdict: AMD wins in long-term upgradeability; Intel offers cutting-edge features at the cost of platform churn.


5. Integrated Graphics and AI Capabilities

  • Intel’s Meteor Lake CPUs include powerful Arc iGPUs and neural processing units (NPUs) optimized for AI tasks and video processing.

  • AMD’s Ryzen 8000 APUs with RDNA 3 iGPUs also bring solid integrated graphics, with AI capabilities expanding in the Ryzen AI series.

🆚 Verdict: Intel takes the edge in AI workloads and iGPU performance, but AMD is closing the gap.


6. Price-to-Performance

  • AMD often offers better value at the mid-range (Ryzen 5 and 7), especially for multitasking and light gaming builds.

  • Intel still aggressively prices its chips, especially in entry-level Core i5 models, which perform well for budget-conscious gamers.

🆚 Verdict: AMD leads in overall value and efficiency; Intel counters with aggressive pricing and high-end gaming chops.


Conclusion: Which Should You Choose?

Use Case | Recommended CPU Family
High-End Gaming | Intel Core i7/i9 (14th Gen)
Content Creation / Productivity | AMD Ryzen 9 / Threadripper
Budget Builds | AMD Ryzen 5 or Intel Core i5
Future Upgrade Path | AMD AM5 platform
AI / Multimedia | Intel Core Ultra (Meteor Lake)

Ultimately, the best CPU depends on your specific needs—gaming, content creation, power efficiency, or future upgrade paths. As of 2025, AMD remains a dominant force in multithreading and efficiency, while Intel maintains leadership in gaming and AI integration.


Perugia, April 9th, 2025



Saturday, March 22, 2025

AutoDock Vina: A Comprehensive Overview


1. Introduction

AutoDock Vina is a widely used molecular docking software designed for predicting the binding affinity and binding poses of small molecules (ligands) with target proteins (receptors). It is an improved version of the original AutoDock software and is known for its enhanced accuracy and significantly faster performance.

AutoDock Vina is particularly popular in the fields of drug discovery, computational chemistry, and structural biology. It is open-source and developed by The Scripps Research Institute.


2. Key Features

  • High Speed and Accuracy:

    • AutoDock Vina is much faster than its predecessor, AutoDock 4, due to its efficient scoring function and optimization algorithms.
    • Provides more reliable docking results with better pose prediction.
  • Simple and Automated Workflow:

    • Requires minimal user intervention and is easier to set up than AutoDock.
    • Automates many parameter settings, making it user-friendly.
  • Flexible Ligand and Receptor Docking:

    • Supports rigid docking (fixed receptor, flexible ligand) and flexible receptor docking (selected residues flexible).
  • Multi-Core CPU Support:

    • Can utilize multiple processor cores to speed up calculations.
  • Energy-Based Scoring Function:

    • Uses an empirical scoring function to estimate binding affinity in kcal/mol.
  • Wide Compatibility:

    • Compatible with Linux, Windows, and macOS.
    • Works well with AutoDockTools (ADT) for input file preparation and visualization.
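Since Vina reports binding affinity as a free energy in kcal/mol, a score can be translated into a predicted dissociation constant with the standard thermodynamic relation ΔG = RT·ln(Kd). This conversion is textbook thermodynamics applied to the score, not something Vina computes for you:

```python
import math

def kd_from_affinity(delta_g_kcal, temp_k=298.15):
    """Convert a docking score (kcal/mol) to a dissociation constant (M),
    using delta_G = R * T * ln(Kd)."""
    R = 0.0019872  # gas constant in kcal/(mol*K)
    return math.exp(delta_g_kcal / (R * temp_k))

# A good Vina score of -9.0 kcal/mol corresponds to roughly 250 nM
print(f"{kd_from_affinity(-9.0):.2e} M")
```

Every extra -1.4 kcal/mol roughly tightens the predicted Kd by a factor of ten, which is why small differences in score matter when ranking poses.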

3. How AutoDock Vina Works

AutoDock Vina performs molecular docking by following these steps:

  1. Protein and Ligand Preparation:

    • The receptor (protein) and ligand structures are prepared in PDBQT format using AutoDockTools (ADT).
    • The receptor is usually kept rigid, while the ligand is assigned rotatable bonds.
  2. Defining the Search Space (Grid Box):

    • A search box is defined around the active site of the receptor, specifying where the ligand can explore binding conformations.
  3. Docking Process:

    • The software uses a stochastic global search combined with gradient-based local optimization to generate ligand conformations and predict binding poses.
    • The binding energy of each pose is estimated with Vina's empirical scoring function.
  4. Result Analysis:

    • The best docking pose (lowest binding energy) is selected.
    • Users analyze the output PDBQT files using visualization tools like PyMOL, Chimera, or Discovery Studio.

4. Applications

  • Drug Discovery:

    • Identifying lead compounds by screening molecular libraries.
    • Predicting drug-receptor interactions.
  • Enzyme Inhibitor Design:

    • Modeling how small molecules inhibit enzymes by binding to active sites.
  • Protein-Ligand Interaction Studies:

    • Understanding molecular interactions to aid in rational drug design.
  • Virtual Screening:

    • Screening large compound libraries to find potential drug candidates.

5. Advantages & Limitations

Advantages:
  • Free and open-source.
  • Faster than AutoDock 4.
  • User-friendly and requires minimal setup.
  • Supports parallel computing (multi-threading).
  • Provides accurate binding energy predictions.
Limitations:
  • Cannot handle covalent docking directly.
  • Less flexible receptor handling than more advanced tools like RosettaDock.
  • Limited to rigid body docking with only selected receptor flexibility.

6. Comparison: AutoDock Vina vs. AutoDock 4

Feature | AutoDock Vina | AutoDock 4
Speed | Faster | Slower
Scoring Function | Empirical | Semi-empirical force field (grid maps)
Ease of Use | Easier | More complex
Multi-threading | Yes | No
Flexible Receptor | Limited | More control

7. Getting Started with AutoDock Vina

Installation
  • Download from the official AutoDock website.
  • Available for Windows, Linux, and macOS.
  • Requires Python, OpenBabel, and AutoDockTools for file preparation.
Basic Command Line Usage
vina --receptor protein.pdbqt --ligand ligand.pdbqt --center_x 10 --center_y 20 --center_z 15 --size_x 20 --size_y 20 --size_z 20 --out output.pdbqt

This command specifies:

  • The receptor and ligand files.
  • The docking grid center and size.
  • The output file containing predicted poses.
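For virtual screening you rarely type this command by hand; a small script can generate the same invocation for every ligand in a library. The sketch below only assembles the command string (file names and grid values are placeholders), leaving execution to `subprocess` on a machine where `vina` is installed:

```python
import shlex

def vina_command(receptor, ligand, center, size, out):
    """Assemble an AutoDock Vina command line for one ligand.
    `center` and `size` are (x, y, z) tuples in Angstroms."""
    cx, cy, cz = center
    sx, sy, sz = size
    args = ["vina",
            "--receptor", receptor, "--ligand", ligand,
            "--center_x", str(cx), "--center_y", str(cy), "--center_z", str(cz),
            "--size_x", str(sx), "--size_y", str(sy), "--size_z", str(sz),
            "--out", out]
    return shlex.join(args)  # pass `args` to subprocess.run() to execute

cmd = vina_command("protein.pdbqt", "ligand.pdbqt",
                   center=(10, 20, 15), size=(20, 20, 20),
                   out="output.pdbqt")
print(cmd)
```

Looping this over a directory of ligand PDBQT files, with one output file per ligand, is the simplest form of a virtual screening pipeline.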

8. Related Tools for Visualization

  • PyMOL – View and analyze docked complexes.
  • Chimera – Advanced molecular visualization.
  • Discovery Studio – Commercial tool with detailed interaction analysis.

Conclusion

AutoDock Vina is a powerful, free, and efficient docking tool widely used in computational drug discovery. Its ease of use, speed, and improved scoring function make it a preferred choice over AutoDock 4 for many researchers.

To download it:
https://vina.scripps.edu/

To get a consultancy on your new docking project, please contact me at MPA@pharmakoi.com


Enjoy!!

Mass




Sunday, March 16, 2025

The ASUS TUF Gaming B850-PLUS WIFI Motherboard

The Observer Corner:

Today we dive into the ASUS TUF Gaming B850-PLUS WIFI, in my opinion one of the best price-to-performance motherboards on the market.

The ASUS TUF Gaming B850-PLUS WIFI motherboard is an ATX board designed for AMD Ryzen 9000, 8000, and 7000 series processors. It features PCIe 5.0 x16 support, Wi-Fi 7, and Realtek 2.5Gb Ethernet, making it ideal for gaming and high-performance computing.

Key Specifications:

  • Graphics Outputs: DisplayPort (8K@30Hz) and HDMI 2.1 (4K@60Hz).
  • Expansion Slots: PCIe 5.0 (x16), PCIe 4.0 (x16, x8/x4 mode), and PCIe 4.0 (x4, x1 slots).
  • Storage: 3x M.2 slots (PCIe 5.0/4.0) and 4x SATA 6Gb/s ports.
  • USB Ports:
    • Rear I/O: 1x USB-C (20Gbps), 3x USB-A (10Gbps), 4x USB-A (5Gbps), and 2x USB 2.0.
    • Front Panel: 1x USB-C (10Gbps), 2x USB 5Gbps, and 4x USB 2.0.
  • Networking: Wi-Fi 7 (up to 2.9Gbps) and Bluetooth 5.4.
  • Audio: Realtek ALC1220P 7.1 Surround Sound with premium audio components.
  • Cooling & Power: 4+ chassis fan headers, 1x AIO pump header, 2x 8-pin CPU power connectors.

This motherboard includes ASUS TUF PROTECTION, Q-Design features for easy installation, and Aura Sync RGB headers for customization.

Take a look at the link below for more details:
https://dlcdnets.asus.com/pub/ASUS/mb/SocketAM5/TUF_GAMING_B850-PLUS_WIFI/E25809_TUF_GAMING_B850-PLUS_WIFI_UM_V2_WEB.pdf?model=TUF%20GAMING%20B850-PLUS%20WIFI

Enjoy!!

Massimiliano
Perugia, March 15th, 2025



Latest trends in GPU technology

Perugia - March 9th, 2025


The latest trends in GPU technology for fluid simulation highlight significant advancements in performance, scalability, and cost efficiency.

GPU Acceleration in Computational Fluid Dynamics (CFD)
GPUs are now an essential tool in CFD, drastically reducing simulation times. Tasks that once took an entire day on CPU servers can now be completed in just over an hour using multiple high-performance GPUs. This acceleration benefits industries such as aerospace, automotive, and pharmaceuticals, where fluid dynamics simulations play a critical role in research and development.

Scalability and Multi-GPU Configurations
Multi-GPU setups are becoming more prevalent, offering improved computational power and efficiency. FluidX3D, for example, has demonstrated a system combining Intel and NVIDIA GPUs to maximize performance while keeping costs lower than high-end single-GPU solutions. The ability to integrate GPUs from different vendors allows for more flexible and cost-effective simulation environments.

Optimized GPU Selection for Specific Workloads
Choosing the right GPU depends on the simulation requirements. Consumer-grade GPUs like the RTX 4090 are excellent for single-precision workloads, providing high performance at a lower cost. On the other hand, enterprise GPUs such as the NVIDIA H100 and A100 excel in handling double-precision and memory-intensive tasks, making them more suitable for large-scale and highly detailed simulations.

Cloud and Hybrid Deployments
Many CFD software providers, including industry leaders like Ansys and Siemens, are optimizing their tools for GPU acceleration in both on-premise and cloud-based environments. Cloud solutions powered by high-performance GPUs enable scalable, on-demand simulations, reducing infrastructure costs and increasing accessibility for researchers and engineers.

Expansion of Competition in High-Performance CFD
AMD is making strides in the high-performance computing space with its Instinct MI300X GPU, which is specifically designed to handle computationally heavy simulations. This competition provides more options for researchers and engineers, challenging NVIDIA’s dominance in the field and fostering further innovation.

Overall, GPUs are transforming fluid simulation by making it faster, more efficient, and more scalable. With continued advancements in hardware and software optimization, the future of CFD looks increasingly driven by high-performance GPU computing.


Interested in a custom-built workstation?
Send your inquiry to MPA@pharmakoi.com indicating the overall performance you are looking for (TFLOPS, etc.) and you will receive a free quote for a proposed configuration.





Saturday, January 4, 2025

Why are NVIDIA A40 GPUs so popular?

The NVIDIA A40 GPU is popular for its versatility and high performance across various computational workloads. Here’s why it stands out:


1. Designed for Versatile Use

The NVIDIA A40 is built to handle diverse workloads, including:

  • AI and Machine Learning: Its architecture supports AI training and inference with high precision.
  • Graphics Rendering: Offers exceptional rendering capabilities for virtual environments and 3D applications.
  • High-Performance Computing (HPC): Optimized for computational tasks like simulations, scientific research, and cryptocurrency mining.

This flexibility makes the A40 appealing across industries, from AI research to creative design and enterprise workloads.


2. Ampere Architecture

The A40 is based on NVIDIA's Ampere architecture, which includes:

  • CUDA Cores: A significant number of CUDA cores (10,752) to accelerate parallel processing tasks.
  • RT Cores and Tensor Cores: Enhancements for ray tracing and AI-specific operations.
  • Memory Bandwidth: Equipped with 48GB of GDDR6 memory and a bandwidth of 696 GB/s, making it ideal for memory-intensive applications.

These architectural advancements provide a significant performance boost over previous generations, contributing to its popularity.
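A quick back-of-the-envelope calculation from those specs (my own arithmetic, not an NVIDIA figure): sweeping the full 48 GB once at 696 GB/s takes about 69 ms, which bounds how fast any memory-bound kernel on the card can possibly run.

```python
def min_stream_time_ms(mem_gb, bandwidth_gb_s):
    """Lower bound on the time to read a buffer once at peak bandwidth."""
    return mem_gb / bandwidth_gb_s * 1000.0

# Time to stream the A40's entire 48 GB at 696 GB/s, in milliseconds
print(round(min_stream_time_ms(48, 696), 1))  # 69.0
```

Numbers like this are why memory bandwidth, not just core count, determines real-world throughput for memory-intensive applications.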


3. Excellent Performance-to-Cost Ratio

Compared to flagship GPUs like the NVIDIA A100, the A40 provides excellent computational and rendering performance at a relatively lower price point. This balance between performance and cost makes it attractive for enterprises looking for powerful solutions without overspending.


4. Enterprise and Data Center Optimizations

  • Passive Cooling Design: Designed for data center environments, the A40 has a passive cooling mechanism, making it ideal for server racks.
  • Virtualization: Supports NVIDIA’s virtual GPU (vGPU) technology, enabling use cases in virtual desktops and high-performance rendering in remote environments.

5. Popular in Cryptocurrency Mining

The A40 has gained popularity among cryptocurrency miners due to its:

  • High Hash Rates: Especially for memory-intensive algorithms like Ethereum before the shift to proof-of-stake.
  • Energy Efficiency: Provides a good balance of performance per watt, which is critical for mining profitability.

6. Preferred for AI and HPC

  • AI Training: Its Tensor Cores enable efficient processing of AI workloads, while its large memory capacity supports large models and datasets.
  • Inference: With mixed-precision capabilities, it can handle real-time AI inference tasks effectively.
  • HPC Applications: Its ability to process complex scientific computations makes it a favored choice in research and enterprise HPC environments.

7. Industry Adoption and Ecosystem

  • Widely supported in major deep learning and HPC frameworks like TensorFlow, PyTorch, and MATLAB.
  • Integrated into cloud services and enterprise solutions, making it accessible to a broader range of users.

The NVIDIA A40 GPU’s combination of advanced architecture, diverse use cases, and a competitive performance-to-cost ratio makes it a popular choice across sectors like AI, HPC, graphics rendering, and cryptocurrency mining.




Thursday, January 2, 2025

A detailed technical comparison of Ubuntu and CentOS, focusing on aspects relevant to computational tasks and industrial use cases

1. Base and Philosophy

  • Ubuntu:
    • Base: Debian-based.
    • Philosophy: Prioritizes usability, regular updates, and a large ecosystem. Ideal for both desktop and server environments.
    • Target Users: Developers, researchers, and users looking for a balance of cutting-edge and stability.
  • CentOS:
    • Base: Historically based on Red Hat Enterprise Linux (RHEL). After CentOS Stream's introduction, it now serves as RHEL's upstream.
    • Philosophy: Stability and predictability. Ideal for enterprise environments needing long-term support and tested packages.
    • Target Users: Enterprises requiring rock-solid stability and HPC clusters.

2. Package Management

  • Ubuntu:
    • Package Manager: APT (Advanced Package Tool), which uses .deb packages.
    • Repositories: Includes Main, Universe, Restricted, and Multiverse repositories, offering a large selection of pre-built software.
    • Advantages:
      • Faster updates and access to newer software versions.
      • Strong focus on compatibility with modern software (e.g., Python, machine learning libraries).
  • CentOS:
    • Package Manager: YUM or DNF (on newer versions), which uses .rpm packages.
    • Repositories: Limited compared to Ubuntu by default, but extended using EPEL (Extra Packages for Enterprise Linux) and third-party repos.
    • Advantages:
      • Highly stable, enterprise-ready software versions.
      • Better suited for systems requiring strict version control (e.g., older Python or GCC for compatibility).

3. Release Cycle and Updates

  • Ubuntu:

    • Releases: Two versions:
      • LTS (Long-Term Support): Released every two years, supported for 5 years (e.g., 20.04, 22.04).
      • Non-LTS: Released every six months, supported for 9 months.
    • Update Frequency: Frequent updates with newer features, kernels, and software versions.
    • Best Use: Projects needing cutting-edge software and hardware support.
  • CentOS:

    • Releases:
      • CentOS Stream: Continuous updates as the upstream development version of RHEL.
      • CentOS 7/8 Legacy: Provided stability-focused updates, now largely replaced by CentOS Stream, AlmaLinux, or Rocky Linux.
    • Update Frequency: Slower and more deliberate updates focused on stability.
    • Best Use: Environments requiring long-term stability with minimal changes.

4. System Performance

  • Ubuntu:
    • Kernel: Ships with relatively new kernels in both LTS and non-LTS versions, allowing better hardware compatibility.
    • Performance: Optimized for modern workloads but may introduce slight instability due to newer software versions.
    • System Overhead: Lightweight flavors like Ubuntu Server or Ubuntu Minimal reduce overhead.
  • CentOS:
    • Kernel: Uses older, more stable kernel versions optimized for enterprise use. Hardware enablement may require backporting.
    • Performance: Focuses on consistency and low overhead in enterprise settings.
    • System Overhead: Minimal by design; better for high-load and mission-critical tasks.

5. Community and Enterprise Support

  • Ubuntu:

    • Community Support: Large and active community with extensive online documentation.
    • Enterprise Support: Canonical offers enterprise support for Ubuntu (e.g., Ubuntu Advantage).
    • Ecosystem: Widely used in machine learning, AI, and cloud environments like AWS and Azure.
  • CentOS:

    • Community Support: Smaller community compared to Ubuntu but still active in enterprise and HPC environments.
    • Enterprise Support: None directly for CentOS; instead, enterprises turn to RHEL, AlmaLinux, or Rocky Linux for support.
    • Ecosystem: Favored in HPC, scientific computing, and traditional enterprise environments.

6. Software Availability

  • Ubuntu:
    • Default Software: Supports a broader range of newer packages.
    • Compatibility: Better suited for modern languages, libraries, and frameworks (e.g., TensorFlow, Docker).
    • Cloud Integration: Leading choice for cloud-native technologies like Kubernetes and containerized applications.
  • CentOS:
    • Default Software: Ships with older, highly stable versions.
    • Compatibility: Ideal for legacy applications or systems requiring specific older software versions.
    • Cloud Integration: Supported but less prominent compared to Ubuntu.

7. HPC and Computational Workloads

  • Ubuntu:

    • Preferred for machine learning, AI, and development environments due to cutting-edge tools and frameworks.
    • Easier installation of GPU drivers (e.g., NVIDIA) and frameworks like TensorFlow or PyTorch.
  • CentOS:

    • Strong presence in HPC clusters and scientific computing.
    • Compatible with software requiring specific older libraries or system configurations.

8. Security and Compliance

  • Ubuntu:

    • Regular security updates.
    • Canonical provides enterprise-grade security solutions, including FIPS compliance.
    • Snap packages can introduce security concerns due to their permissions model.
  • CentOS:

    • Stability-focused updates reduce the risk of security issues from newer software.
    • SELinux (Security-Enhanced Linux) is enabled by default, offering robust system security.

When to Use Ubuntu vs. CentOS

Feature | Ubuntu | CentOS
Modern Workloads | Best for machine learning, AI, and cloud | Ideal for legacy or enterprise workloads
Stability | Moderate (LTS preferred) | High (CentOS Stream or AlmaLinux)
Cutting-Edge Software | Excellent | Limited; slower updates
Long-Term Support | 5 years (LTS) | Enterprise-grade with RHEL
Ease of Use | Easier for beginners | Better for experienced admins



AMD Radeon RX 9060 XT vs. NVIDIA GeForce RTX 5060 Ti (16 GB)

  To get suggestions on how to configure an HEDT (High End Desktop), do not hesitate to reach out to me at MPA@pharmakoi.com or leave a mess...