Enhancing Portfolio Diversification with Link Prediction: A Graph Data Science Approach - Neo4j Industry Use Cases

🚀 Rethinking Portfolio Diversification with Graph Data Science

Traditional correlation matrices only tell us where markets have been—not where they're going. In today’s hyper-connected financial landscape, that’s not enough.

In the latest work by Nuno Pedro L., we use Neo4j Graph Data Science to model equities as a dynamic network and apply Link Prediction to anticipate future relationships between assets. Instead of reacting to correlations after they form, we can predict them—uncovering hidden risks, emerging clusters, and new opportunities for statistical arbitrage before they appear in traditional models.

🔍 Why it matters:

  • Captures non-linear, evolving market structures
  • Reveals early signals of contagion or co-movement
  • Supports smarter diversification and proactive risk management

If you’re exploring the future of quantitative finance, network analytics, or portfolio intelligence, this approach is a game-changer.

📈 Graph data science isn’t just descriptive—it’s predictive.
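To make the link-prediction idea concrete, here is a minimal sketch in plain Python. It is not the Neo4j Graph Data Science pipeline from the article. It swaps in a classic heuristic, the Adamic-Adar score, which ranks currently unconnected pairs by how many low-degree neighbors they share. The tickers and edges are hypothetical, and an edge is assumed to mean "strongly correlated in a past window".

```python
import math
from itertools import combinations

# Hypothetical toy market graph: tickers and edges are made up for illustration.
edges = [
    ("AAA", "BBB"), ("AAA", "CCC"), ("BBB", "CCC"),
    ("CCC", "DDD"), ("DDD", "EEE"), ("BBB", "DDD"),
]

def build_adjacency(edge_list):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def adamic_adar(adj, u, v):
    """Sum 1/log(degree) over the common neighbors of u and v."""
    return sum(
        1.0 / math.log(len(adj[w]))
        for w in adj[u] & adj[v]
        if len(adj[w]) > 1  # skip degree-1 nodes, whose log(degree) is 0
    )

def rank_missing_links(adj):
    """Score every unconnected pair and return them from highest to lowest score."""
    candidates = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]
    scored = [(adamic_adar(adj, u, v), u, v) for u, v in candidates]
    return sorted(scored, reverse=True)

adj = build_adjacency(edges)
ranking = rank_missing_links(adj)
for score, u, v in ranking:
    print(f"{u}-{v}: {score:.3f}")
```

The top-ranked pair is the one the heuristic considers most likely to become connected next. A learned model of the kind the article describes would replace the hand-written score with features and a trained classifier.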

·neo4j.com·
Graph Embeddings at scale with Spark and GraphFrames

One of my biggest contributions to the GraphFrames project is scalable graph embeddings. While not perfect, my implementation is inexpensive to compute and horizontally scalable. It uses a combination of random walks and Hash2Vec, an algorithm based on random projection theory.

In the post, I provide the full code and an explanation of all the engineering decisions I made. For example, I explain why I used Reservoir Sampling for neighbor aggregation or Map Partitions instead of the DataFrame API.

The pull request (PR) has not been merged yet, so if you have any ideas on how to improve the approach, I would love to hear them! Overall, it appears to be a good, inexpensive way to create scalable embeddings of graph vertices that can easily be incorporated into existing classification or recommender system pipelines. Finally, GraphFrames will have real capabilities for graph data science! At least, I hope so. :)
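As a rough illustration of two ingredients mentioned in this post, the sketch below shows reservoir sampling (Algorithm R) used to pick neighbors from a stream, and a random walk built on top of it. This is plain single-machine Python, not the GraphFrames or Spark implementation. The function names and the toy graph are hypothetical, and the Hash2Vec step is left out.

```python
import random

def reservoir_sample(stream, k, rng):
    """Pick k items uniformly at random from an iterable of unknown length."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)      # fill the reservoir first
        else:
            j = rng.randint(0, i)       # inclusive on both ends
            if j < k:
                reservoir[j] = item     # replace with probability k / (i + 1)
    return reservoir

def random_walk(neighbors_of, start, length, rng):
    """Walk up to `length` steps, choosing each next vertex with a size-1 reservoir."""
    walk = [start]
    for _ in range(length):
        picked = reservoir_sample(neighbors_of(walk[-1]), 1, rng)
        if not picked:                  # dead end: no outgoing neighbors
            break
        walk.append(picked[0])
    return walk

# Hypothetical toy graph stored as adjacency lists.
graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
rng = random.Random(42)

walk = random_walk(lambda v: iter(graph[v]), start=0, length=5, rng=rng)
sample = reservoir_sample(iter(range(1000)), 10, rng)
print(walk, sample)
```

The appeal of a reservoir is that it needs a single pass and constant memory per vertex, so neighbors can be consumed as they stream by instead of being collected into a list first.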

·semyonsinchenko.github.io·
Cosmograph graph visualization tool
Huge news for Cosmograph 🪐

While everyone was on Thanksgiving break, I was polishing up the next big Cosmograph update, which I'm finally ready to share! More than three years after the initial release, Cosmograph remains the only single-node web-based tool capable of visualizing graphs with 1 million points and way more than a million links due to its unique GPU Force Layout and Rendering engine cosmos.gl. However, it also had a few major weaknesses like poor memory management and limited analytical capabilities.

Version 2.0 of Cosmograph solves these problems by incorporating:

  • DuckDB (the best in-memory analytics database);
  • Mosaic (the fastest cross-filtering and visual analytics framework for the web);
  • SQLRooms (an open-source React toolkit for human and agent collaborative analytics apps by Ilya Boyandin) as its foundation;
  • The latest version of cosmos.gl (our core force simulation and rendering engine that recently joined OpenJS) to give you even faster performance, more forces, and the long-awaited point-dragging functionality!

What does this mean in practice?

  • Work with larger datasets and use SQL (thanks to WebAssembly and DuckDB);
  • Much better performance (filtering, timeline, changing visual properties of the graph, etc.);
  • Open Parquet files natively;
  • Save your graphs to the cloud and share them with the world easily.

And if you work with ML embeddings and love Apple's Embedding Atlas (https://lnkd.in/gsWt6CNT), you'll love Cosmograph too since they have a lot in common.

If all the above has excited you, go check out Cosmograph's new beautiful website, and share the news with the world 🙏 https://cosmograph.app
·linkedin.com·
OSMnx is a Python package that downloads any city’s street network, buildings, bike lanes, rail, or walkable paths from OpenStreetMap and instantly turns them into clean, routable NetworkX graphs with correct topology, projected coordinates, edge lengths, bearings, and travel speeds.
·linkedin.com·
What Is a Security Graph? Understanding the Foundation of Modern Cybersecurity | LinkedIn
Shawn Bice: A core component of our AI-first, end-to-end security platform that we announced recently is the Microsoft Sentinel graph. The term ‘graph’ is used broadly in the security industry, yet it is often misunderstood or used inaccurately.
·linkedin.com·
GraphFrames 0.10.0 release
On behalf of the GraphFrames maintainers, I am happy to announce a new release. It is a significant improvement!

Performance and memory management:

  • The new release provides 3-50x faster performance for all algorithms.
  • The 5x performance improvement in Connected Components is especially important, as it allows one to perform graph-based identity resolution much faster with the new GraphFrames.
  • All Pregel-based algorithms, such as Shortest Paths and Label Propagation, received a boost of around 3x.
  • The new release comes with its own internal fork of Apache #Spark GraphX due to its deprecation in upstream Spark. This allows us to improve the performance of GraphX-based Label Propagation by 50x and fix memory leaks. Now, it is usable for graph processing inside Structured Streaming.

New algorithms:

  • K-core centrality, cycle detection, and maximal independent set were added. All of them are based on advanced scientific papers and operate fully in a distributed manner.

New APIs:

  • A new API for computing vertex degrees based on edge types was added.
  • The motif-finding API now supports undirected, bidirectional, and multi-hop patterns.
  • The #PySpark API has all the recent improvements in the Scala core, so there is feature parity between the core and Python.

Documentation improvements:

  • The documentation has been significantly expanded, especially the sections on the arguments and parameters of the algorithms.
  • To simplify the onboarding process for new users, the documentation website now contains an llms.txt file in the root directory. Asking an LLM chatbot or coding assistant about how to use GraphFrames is now more efficient.

It is already published in Maven Central and PyPI! Blog post: https://lnkd.in/dU4kRmSD
·linkedin.com·
Unlock GPU Power with GFQL
Rough news on #kuzu being archived. Startups are hard, and Semih Salihoğlu + Prashanth Rao did so much in ways I value, and around the same architectural principles we've been quietly tackling in GFQL. For those left in the lurch for an embeddable compute-tier solution to graphs, #GFQL should be pretty fascinating yet also familiar (ex: Apache Arrow-native graph queries for modern OSS ecosystems), and hopefully less stressful thanks to a sustainable governance model. Likewise, as an OSS deep-tech community, we add interesting new bits like the optional record-breaking GPU mode with NVIDIA #RAPIDSAI.

If you're new to it and seeing this: #GFQL, the graph dataframe-native query language, is increasingly how Graphistry, Inc. and our community work with graphs at the compute tier. Whether the data comes from a tabular ETL pipeline, a file, SQL, NoSQL, or a graph storage DB, GFQL makes it easy to do on-the-fly graph transforms and queries at the compute tier at sub-second speeds for graphs anywhere from 100 edges to 1,000,000,000. Currently, we support Arrow/pandas and Arrow/#nvidia #RAPIDS as the main engine modes.

While we're not marketing it much yet, GFQL is already used daily by every single Graphistry user behind the scenes, and directly by analysts & developers at banks, startups, etc. around the world. We built it because we needed an OSS compute-tier graph solution for working with modern data systems that separate storage from compute. Likewise, data is a team sport, so it is used by folks on teams who have to rapidly wrangle graphs, whether for analysis, data science, ETL, visualization, or AI. Imagine an ETL pipeline or notebook flow or web app where data comes from files, Elasticsearch, Databricks, and Neo4j, and you need to do more on-the-fly graph stuff with it.

We started building what became GFQL *before* Kuzu because it solves real architectural & graph productivity problems that have been challenging our team, our users, and the broader graph community for years now. Likewise, by going dataframe-native & GPU-mode from day 1, it's now a large part of how we approach GPU graph deep tech investments throughout our stack, which means it's a sustainably funded system. We are looking at bigger R&D and commercial support contracts with organizations needing to do subsecond billion+-scale with us so we can build even more, faster (hit me up if that's you!), but overall, most of our users are just like ourselves, and the day-to-day is wanting an easy OSS way to wrangle graphs in our apps & notebooks. As we continue to smooth it out (ex: we'll be adding a familiar Cypher syntax), we'll be writing about it a lot more.

4 links below: ReadTheDocs, pip install, SOTA GPU benchmarks, and original aha moment + Russell Jurney Ben Lorica 罗瑞卡 Taurean Dyer Bradley Rees
·linkedin.com·
YouTube channel on graphs has just exceeded 3,000,000 views
My YouTube channel on graphs has just passed 3,000,000 views! 🎉 🍾 With 73 videos available 🖥️, it helps more than 25,000 subscribers (and others who drop by by chance) get familiar with a subject that should be part of every engineer's general knowledge 📚. Built mainly around commented examples, my videos explore many topics in this field at the intersection of discrete mathematics and computer science. Graphs are "present" everywhere, in every system made up of elements and relationships between those elements; they can help to model such systems, to understand them better, and to put them to use. I have not published anything on this channel for several months now, but every day newcomers (mostly students, but not only) come to discover these objects that are simple to describe yet so hard to manipulate efficiently! My channel is not monetized, so I make no money from it. The ads are added by YouTube, for its sole benefit... https://lnkd.in/exfWrPxA
·linkedin.com·
The single most undervalued fact of graph theory: Every board is a graph in disguise
The single most undervalued fact of graph theory: Every board is a graph in disguise. Here’s the 3-step mapping that turns messy “rooms” into clean, countable components.

0/ You’re given a map of walls and floor tiles. By eye, you see there are three rooms. But how do you get a computer to see them too?

1/ Start by modeling the board as a graph. Treat every floor tile as a node. Define valid moves as edges. In our case, moves are the four directions:

  • Up
  • Down
  • Left
  • Right

Walls simply remove edges because you can’t step through them.

2/ Number the floor tiles arbitrarily so you can reference nodes. Now you’ve converted the board to an undirected graph. Why do this? Because two common board questions become standard graph problems:

  1. “Shortest path between two tiles?” becomes “shortest path between two nodes.”
  2. “How many rooms?” becomes “how many connected components?”

That second one is our target. A “room” is just a maximal set of tiles reachable from each other without crossing walls. In graph terms, that’s a connected component. So the count of rooms equals the count of connected components.

Here’s the practical recipe I use:

  • Nodes = all floor tiles.
  • Edges = pairs of floor tiles one step apart (U/D/L/R).
  • Walls = missing edges.
  • Rooms = connected components.
  • Answer = number of connected components.

3/ Run a DFS or BFS from every unvisited node and mark all reachable tiles. Each fresh start increments the room counter by one.

That’s it. No heuristics, no guesswork, just graph structure doing the heavy lifting. Once you see boards as graphs, these problems stop feeling ad hoc. They become repeatable templates you can code in minutes. If this helped, repost so more people learn the “rooms = components” pattern.
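The recipe in this post can be sketched in a few lines of Python. This is a minimal BFS version under assumed conventions that are not in the original post: the board is a list of equal-length strings, `.` marks a floor tile, and `#` marks a wall. The function name `count_rooms` is hypothetical.

```python
from collections import deque

def count_rooms(board):
    """Count connected components of floor tiles ('.') using 4-directional moves."""
    rows = len(board)
    cols = len(board[0]) if rows else 0
    seen = set()
    rooms = 0
    for r in range(rows):
        for c in range(cols):
            if board[r][c] != "." or (r, c) in seen:
                continue
            rooms += 1                  # fresh start: a new component
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                # Edges are the four moves; walls are simply never enqueued.
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and board[nr][nc] == "." and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
    return rooms

board = [
    "#########",
    "#..#....#",
    "#..#....#",
    "#########",
    "#.......#",
    "#########",
]
print(count_rooms(board))  # prints 3
```

The graph is never built explicitly here: tiles act as nodes and the four offsets act as edges, which is usually enough for grid problems. Each tile is visited once, so the running time is linear in the board size.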
·linkedin.com·
Introducing Brahmand: a Graph Database built on top of ClickHouse
Introducing Brahmand: a Graph Database built on top of ClickHouse. It extends ClickHouse with native graph modeling and OpenCypher, merging OLAP speed with graph analysis. While it’s still in early development, it’s been fun writing my own Cypher parser, query planner with logical plan, analyzer, and optimizer in Rust. On the roadmap: native JSON support, the Bolt protocol, missing Cypher features like WITH, EXISTS, and variable-length relationship matches, along with bitmap-based optimizations and distributed cluster support. Feel free to check out the repo: https://lnkd.in/d-Bhh-qD I’d really appreciate a ⭐ if you find it useful!
·linkedin.com·
GitHub - karam-ajaj/atlas: Open-source tool for network discovery, visualization, and monitoring. Built with Go, FastAPI, and React, supports Docker host scanning.
Open-source tool for network discovery, visualization, and monitoring. Built with Go, FastAPI, and React, supports Docker host scanning.
·github.com·
Graph Algorithms: A Developer's Guide | LinkedIn
Graph algorithms address a wide variety of problems by representing entities as nodes and relationships as edges. Want to find the fastest route through a city? Spot influential users on a social media platform? Or detect communities hidden in a sprawling network? Graph algorithms provide the tools
·linkedin.com·
You don’t need a PhD to choose the right graph representation
You don’t need a PhD to choose the right graph representation. You just need to know what you’ll do with the graph and pick the structure that makes that fast and easy. Below is a quick, practical guide you can use today:

1. Edge List: Start here when you just need “a list of connections.” An edge list is literally a list of edges: (x, y) for directed graphs, (x, y) and (y, x) for undirected, and (x, y, w) if weighted. It shines when you process edges globally (e.g., sort by weight for Kruskal’s MST). It’s also the most compact when your graph is sparse and you don’t need constant-time lookups.

When to use:

  • You’ll sort/filter edges (MST, connectivity batches)
  • You’re loading data from CSV/logs where edges arrive as rows
  • You want minimal structure first and you’ll convert later if needed

Trade-offs:

  • Space: O(m)
  • Iterate neighbors of x: O(m) (unless you pre-index)
  • Check if (x, y) exists: O(m) (or O(1) with an auxiliary hash set you maintain)

2. Adjacency Matrix: Use this when instant “is there an edge?” matters. A 2D array G[x][y] stores 1 (or weight) if the edge exists, else 0 (or a sentinel like −1/∞). You get constant-time edge existence checks and very clean math/linear-algebra operations. The cost is space: O(n²) even if the graph is tiny. If memory is fine and you need O(1) membership checks, go matrix.

When to use:

  • Dense graphs (m close to n²)
  • Fast membership tests dominate your workload
  • You’ll leverage matrix ops (spectral methods, transitive closure variations, etc.)

Trade-offs:

  • Space: O(n²)
  • Check if (x, y) exists: O(1)
  • Iterate neighbors of x: O(n)

3. Adjacency List: Choose this when you traverse from nodes to neighbors a lot. Each node stores its exact neighbors, so traversal is proportional to degree, not n. This representation is ideal for BFS/DFS, Dijkstra (with weights), and most real-world sparse graphs. Membership checks are slower than a matrix unless you keep neighbor sets. For sparse graphs and traversal-heavy algorithms, pick adjacency lists.

When to use:

  • Graph is sparse (common in practice)
  • You’ll run BFS/DFS, shortest paths, topological sorts
  • You need O(n + m) space and fast neighbor iteration

Trade-offs:

  • Space: O(n + m)
  • Iterate neighbors of x: O(deg(x))
  • Check if (x, y) exists: O(deg(x)) (or O(1) if you store a set per node)

Pick the representation that optimizes your next operation, not some abstract ideal. Edge lists for global edge work, matrices for instant membership, lists for fast traversal. Choose deliberately, and your graph code gets both cleaner and faster.

If you want a deep dive on graph representation, say “Graphs” in the comments and I’ll send it to you on DM.
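The three representations compared in this post can be put side by side on one small graph. The sketch below is illustrative only: the helper names and the 4-node example are hypothetical. For brevity the edge list stores each undirected edge once and checks both orientations, instead of storing (x, y) and (y, x) as the post describes.

```python
def to_adjacency_matrix(n, edges):
    """n x n matrix with 1 where an undirected edge exists. Space: O(n^2)."""
    matrix = [[0] * n for _ in range(n)]
    for x, y in edges:
        matrix[x][y] = 1
        matrix[y][x] = 1
    return matrix

def to_adjacency_list(n, edges):
    """One neighbor list per node. Space: O(n + m)."""
    adj = [[] for _ in range(n)]
    for x, y in edges:
        adj[x].append(y)
        adj[y].append(x)
    return adj

n = 4
edge_list = [(0, 1), (0, 2), (2, 3)]   # Space: O(m)
matrix = to_adjacency_matrix(n, edge_list)
adj = to_adjacency_list(n, edge_list)

# "Is there an edge between 2 and 3?"
in_edge_list = (2, 3) in edge_list or (3, 2) in edge_list  # O(m) scan
in_matrix = matrix[2][3] == 1                              # O(1) lookup
in_adj = 3 in adj[2]                                       # O(deg(2)) scan

# "Who are the neighbors of node 0?"
from_edges = [y if x == 0 else x for x, y in edge_list if 0 in (x, y)]  # O(m)
from_matrix = [y for y in range(n) if matrix[0][y]]                     # O(n)
from_list = adj[0]                                                      # O(deg(0))

print(in_edge_list, in_matrix, in_adj)
print(from_edges, from_matrix, from_list)
```

All three answer the same questions. What changes is how much work each answer costs, which is the trade-off the post asks you to choose on.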
·linkedin.com·
Graph training: Graph Tech Demystified
Calling all data scientists, developers, and managers! 📢 Looking to level up your team's knowledge of graph technology? We're excited to share the recorded 2-part training series, "Graph Tech Demystified", with the amazing Paco Nathan. This is your chance to get up to speed on graph fundamentals.

In Part 1: Intro to Graph Technologies, you'll learn:

  • Core concepts in graph tech.
  • Common pitfalls and what graph technology won't solve.
  • Focus of graph analytics and measuring quality.

🎥 Recording https://lnkd.in/gCtCCZH5 📖 Slides https://lnkd.in/gbCnUjQN

In Part 2: Advanced Topics in Graph Technologies, we explore:

  • Sophisticated graph patterns like motifs and probabilistic subgraphs.
  • Intersection of Graph Neural Networks (GNNs) and Reinforcement Learning.
  • Multi-agent systems and Graph RAG.

🎥 Recording https://lnkd.in/g_5B8nNC 📖 Slides https://lnkd.in/g6iMbJ_Z

Insider tip: The resources alone are enough to keep you busy far longer than the time it takes to watch the training!
·linkedin.com·
Hot take on "faster than Dijkstra"
Hot take on the “faster than Dijkstra” headlines: The recent result given in the paper: https://lnkd.in/dQSbqrhD is a breakthrough for theory. It beats Dijkstra’s classic worst-case bound for single-source shortest paths on directed graphs with non-negative weights. That’s big for the research community.

But it doesn’t “rewrite” production routing. In practice, large-scale systems (maps, logistics, ride-hailing) moved past plain Dijkstra years ago. They rely on heavy preprocessing. Contraction Hierarchies, Hub Labels and other methods are used to answer point-to-point queries in milliseconds, even on large, continental networks.

Why the disconnect?

  • Different goals: The paper targets single-source shortest paths; production prioritizes point-to-point queries at interactive latencies.
  • Asymptotics vs. constants: Beating O(m + n log n) matters in principle, but real systems live and die by constants, cache behavior, and integration with traffic/turn costs.
  • Preprocessing wins: Once you allow preprocessing, the speedups from hierarchical/labeling methods dwarf Dijkstra and likely any drop-in replacement without preprocessing.

We should celebrate the theoretical advance and keep an eye on practical implementations. Just don’t confuse a sorting-barrier result with an immediate upgrade for Google Maps.

Bottom line: Great theory milestone. Production routing already “changed the rules” years ago with preprocessing and smart graph engineering.
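For reference, the baseline under discussion looks roughly like this. It is a minimal sketch of textbook Dijkstra with Python's binary heap, not the new algorithm from the paper and not a production router. A binary heap gives O((m + n) log n); the O(m + n log n) bound quoted above assumes a Fibonacci heap. The example graph is hypothetical.

```python
import heapq

def dijkstra(graph, source):
    """Single-source shortest paths on a directed graph with non-negative weights.

    `graph` maps each node to a list of (neighbor, weight) pairs.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)      # closest unsettled vertex so far
        if d > dist.get(u, float("inf")):
            continue                    # stale heap entry, already improved
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical toy graph.
graph = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 5)],
    "d": [],
}
dist = dijkstra(graph, "a")
print(dist)
```

The priority queue is what effectively orders vertices by distance, and that ordering is the "sorting barrier" the theoretical result works around. Preprocessing-based methods such as Contraction Hierarchies take a different route entirely and are not shown here.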
·linkedin.com·
Faster than Dijkstra? Tsinghua University’s new shortest path algorithm just rewrote the rules of graph traversal.
🚀 Faster than Dijkstra? Tsinghua University’s new shortest path algorithm just rewrote the rules of graph traversal.

For 65+ years, Dijkstra’s algorithm was the gold standard for finding shortest paths in weighted graphs. But now, a team from Tsinghua University has introduced a recursive partial ordering method that outperforms Dijkstra, especially on directed graphs.

🔍 What’s different? Instead of sorting all vertices by distance (which adds log-time overhead), this new approach uses a clever recursive structure that breaks the O(m + n log n) barrier ✨. It’s faster, leaner, and already winning awards at STOC 2025 🏆.

📍 Why it matters: Think Google Maps, Uber routing, disaster evacuation planning, circuit design: any system that relies on real-time pathfinding across massive graphs.

Paper ➡ https://lnkd.in/dGTdRj2X

#Algorithms #ComputerScience #Engineering #Dijkstra #routing #planning #logistic
·linkedin.com·
Find the best link prediction for your specific graph
🔗 How's your Link Prediction going? Did you know that the best algorithm for link prediction can vary by network? Slight differences in your graph data, and you may be better off with a new approach. Join us for an exclusive talk on August 28th to learn how to find the right link prediction model and, ultimately, get to more complete graph data. Researchers Bisman Singh and Aaron Clauset will share a new (just published!) meta-learning approach that uses a network's own structural features to automatically select the optimal link prediction algorithm! This is a must-attend event for any data scientist or researcher who wants to eliminate exhaustive benchmarking while getting more accurate predictions. The code will be made public, so you can put these insights into practice immediately. 🤓 Ready to really geek out? Register now: https://lnkd.in/g38EfQ2s
·linkedin.com·