Public
451 bookmarks
Stochastic gradient descent - Wikipedia
Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties (e.g. differentiable or subdifferentiable). It can be regarded as a stochastic approximation of gradient descent optimization, since it replaces the actual gradient (calculated from the entire data set) by an estimate thereof (calculated from a randomly selected subset of the data). Especially in high-dimensional optimization problems, this reduces the very high computational burden, achieving faster iterations in exchange for a lower convergence rate.
·en.wikipedia.org·
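A minimal sketch of the idea in Python (synthetic data; the learning rate and batch size are illustrative, not from the article): each update uses the gradient on a random mini-batch as a cheap estimate of the full-data gradient.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares problem: recover w_true from noisy data.
X = rng.normal(size=(1000, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=1000)

w = np.zeros(3)                      # parameters to optimize
lr, batch_size = 0.1, 32             # illustrative hyperparameters

for epoch in range(50):
    order = rng.permutation(len(X))  # shuffle, then sweep in mini-batches
    for start in range(0, len(X), batch_size):
        b = order[start:start + batch_size]
        # Gradient of mean squared error on the mini-batch only:
        # an unbiased estimate of the gradient on the whole data set.
        grad = 2.0 * X[b].T @ (X[b] @ w - y[b]) / len(b)
        w -= lr * grad

print(w)  # close to [2.0, -1.0, 0.5]
```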
Differentiable function - Wikipedia
In mathematics, a differentiable function of one real variable is a function whose derivative exists at each point in its domain. In other words, the graph of a differentiable function has a non-vertical tangent line at each interior point in its domain. A differentiable function is smooth (the function is locally well approximated as a linear function at each interior point) and does not contain any break, angle, or cusp.
·en.wikipedia.org·
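As a quick reference, the limit behind this definition, plus the standard counterexample of a function whose graph has an angle:

```latex
f'(a) = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h}
% For f(x) = |x| at a = 0 the one-sided limits differ:
%   \lim_{h \to 0^+} |h|/h = +1  and  \lim_{h \to 0^-} |h|/h = -1,
% so the limit does not exist: |x| is continuous everywhere but not
% differentiable at 0 (a corner, hence no non-vertical tangent line).
```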
Statistical hypothesis testing - Wikipedia
A statistical hypothesis test is a method of statistical inference used to decide whether the data at hand sufficiently support a particular hypothesis. Hypothesis testing allows us to make probabilistic statements about population parameters.
·en.wikipedia.org·
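A small worked example in Python with SciPy (simulated data; the group names and effect size are made up for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Null hypothesis: both groups are drawn from distributions with equal means.
control = rng.normal(loc=10.0, scale=2.0, size=50)
treated = rng.normal(loc=11.0, scale=2.0, size=50)

t_stat, p_value = stats.ttest_ind(control, treated)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# Conventional decision rule at significance level alpha = 0.05:
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```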
p-value - Wikipedia
In null-hypothesis significance testing, the p-value is the probability of obtaining test results at least as extreme as the result actually observed, under the assumption that the null hypothesis is correct. A very small p-value means that such an extreme observed outcome would be very unlikely under the null hypothesis. Reporting p-values of statistical tests is common practice in academic publications of many quantitative fields. Since the precise meaning of p-value is hard to grasp, misuse is widespread and has been a major topic in metascience.
·en.wikipedia.org·
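In code, "at least as extreme" is a tail probability. A sketch assuming the test statistic is standard normal under the null:

```python
from scipy import stats

z = 2.1  # observed test statistic (illustrative value)

# One-sided: probability of a value at least as large as z under the null.
p_one_sided = stats.norm.sf(z)             # survival function, 1 - CDF
# Two-sided: at least as extreme in either direction.
p_two_sided = 2 * stats.norm.sf(abs(z))

print(round(p_one_sided, 4), round(p_two_sided, 4))  # 0.0179 0.0357
```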
Null hypothesis - Wikipedia
In inferential statistics, the null hypothesis (often denoted H0) is that two possibilities are the same: the observed difference is due to chance alone. Using statistical tests, it is possible to calculate how likely the observed data would be if the null hypothesis were true.
·en.wikipedia.org·
Second partial derivative test - Wikipedia
In mathematics, the second partial derivative test is a method in multivariable calculus used to determine if a critical point of a function is a local minimum, maximum or saddle point.
·en.wikipedia.org·
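The test itself fits in a few lines; with D the determinant of the Hessian at a critical point (a, b) where f_x = f_y = 0:

```latex
D(a,b) = f_{xx}(a,b)\,f_{yy}(a,b) - \bigl(f_{xy}(a,b)\bigr)^{2}
% D > 0 and f_{xx}(a,b) > 0  =>  local minimum
% D > 0 and f_{xx}(a,b) < 0  =>  local maximum
% D < 0                      =>  saddle point
% D = 0                      =>  the test is inconclusive
```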
Softmax function - Wikipedia
The softmax function, also known as softargmax or the normalized exponential function, converts a vector of K real numbers into a probability distribution of K possible outcomes. It is a generalization of the logistic function to multiple dimensions, and is used in multinomial logistic regression. The softmax function is often used as the last activation function of a neural network to normalize the output of a network to a probability distribution over predicted output classes, based on Luce's choice axiom.
·en.wikipedia.org·
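A minimal NumPy version (the max-subtraction is a common numerical-stability trick, not part of the definition):

```python
import numpy as np

def softmax(z):
    """Convert K real scores into a probability distribution over K outcomes."""
    e = np.exp(z - np.max(z))  # shifting by max(z) leaves the result unchanged
    return e / e.sum()         # but prevents overflow in exp()

p = softmax(np.array([2.0, 1.0, 0.1]))
print(p, p.sum())  # ~[0.659 0.242 0.099]; sums to 1
```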
Multinomial logistic regression - Wikipedia
In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes. That is, it is a model that is used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables (which may be real-valued, binary-valued, categorical-valued, etc.).
·en.wikipedia.org·
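A sketch with scikit-learn on the classic three-class iris data (assuming a recent scikit-learn, whose default solver fits the multinomial/softmax formulation for multiclass targets):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Three possible discrete outcomes (species), four real-valued predictors.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print(clf.predict_proba(X_test[:3]))  # one probability per class; rows sum to 1
print(clf.score(X_test, y_test))      # held-out accuracy
```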
Elementary matrix - Wikipedia
In mathematics, an elementary matrix is a matrix which differs from the identity matrix by a single elementary row operation. The elementary matrices generate the general linear group GLn(F) when F is a field. Left multiplication (pre-multiplication) by an elementary matrix represents elementary row operations, while right multiplication (post-multiplication) represents elementary column operations. Elementary row operations are used in Gaussian elimination to reduce a matrix to row echelon form. They are also used in Gauss–Jordan elimination to further reduce the matrix to reduced row echelon form.
·en.wikipedia.org·
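A tiny NumPy illustration of the row/column asymmetry described above:

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])

# Identity with one entry changed: encodes "add 2 * row 0 to row 1".
E = np.array([[1., 0.],
              [2., 1.]])

print(E @ A)  # left multiplication: row operation    -> [[1. 2.] [5. 8.]]
print(A @ E)  # right multiplication: column operation -> [[5. 2.] [11. 4.]]
```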
Shear matrix - Wikipedia
In mathematics, a shear matrix or transvection is an elementary matrix that represents the addition of a multiple of one row or column to another. Such a matrix may be derived by taking the identity matrix and replacing one of the zero elements with a non-zero value. The name shear reflects the fact that the matrix represents a shear transformation. Geometrically, such a transformation takes pairs of points in a vector space that are purely axially separated along the axis whose row in the matrix contains the shear element, and effectively replaces those pairs by pairs whose separation is no longer purely axial but has two vector components. Thus, the shear axis is always an eigenvector of the shear matrix.
·en.wikipedia.org·
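For instance, in 2-D (lam is the shear element; the x-axis is the shear axis and stays fixed):

```python
import numpy as np

lam = 1.5
S = np.array([[1.0, lam],    # identity with one zero replaced by lam
              [0.0, 1.0]])

print(S @ np.array([2.0, 1.0]))  # [3.5 1.]: x gains lam * y, y is unchanged
print(S @ np.array([1.0, 0.0]))  # [1. 0.]: the shear axis is an eigenvector
```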
Orthogonality (mathematics) - Wikipedia
In mathematics, orthogonality is the generalization of the geometric notion of perpendicularity to the linear algebra of bilinear forms. Two elements u and v of a vector space with bilinear form B are orthogonal when B(u, v) = 0. Depending on the bilinear form, the vector space may contain nonzero self-orthogonal vectors. In the case of function spaces, families of orthogonal functions are used to form a basis. The concept has been used in the context of orthogonal functions, orthogonal polynomials, and combinatorics.
·en.wikipedia.org·
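Two quick checks in NumPy: orthogonality under the standard dot product, and a nonzero self-orthogonal vector under an indefinite bilinear form (both examples are made up for illustration):

```python
import numpy as np

# Standard bilinear form B(u, v) = u . v
u = np.array([1.0, 2.0, 0.0])
v = np.array([-2.0, 1.0, 5.0])
print(np.dot(u, v))  # 0.0 -> u and v are orthogonal

# Indefinite form B(u, v) = u0*v0 - u1*v1: a nonzero vector can be
# orthogonal to itself.
w = np.array([1.0, 1.0])
print(w[0] * w[0] - w[1] * w[1])  # 0.0
```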
Basis (linear algebra) - Wikipedia
In mathematics, a set B of vectors in a vector space V is called a basis if every element of V may be written in a unique way as a finite linear combination of elements of B. The coefficients of this linear combination are referred to as components or coordinates of the vector with respect to B. The elements of a basis are called basis vectors. Equivalently, a set B is a basis if its elements are linearly independent and every element of V is a linear combination of elements of B. In other words, a basis is a linearly independent spanning set. A vector space can have several bases; however all the bases have the same number of elements, called the dimension of the vector space.
·en.wikipedia.org·
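The "unique coordinates" statement is just a linear solve. A small NumPy example (the basis here is an arbitrary choice):

```python
import numpy as np

# Columns of B are linearly independent, so they form a basis of R^2.
B = np.array([[1.0, 1.0],
              [0.0, 2.0]])
v = np.array([3.0, 4.0])

c = np.linalg.solve(B, v)  # unique coordinates of v with respect to B
print(c)                   # [1. 2.]
print(B @ c)               # [3. 4.] -- reconstructs v
```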
Topology - Wikipedia
In mathematics, topology (from the Greek words τόπος, 'place, location', and λόγος, 'study') is concerned with the properties of a geometric object that are preserved under continuous deformations, such as stretching, twisting, crumpling, and bending; that is, without closing holes, opening holes, tearing, gluing, or passing through itself.
·en.wikipedia.org·
Linear span - Wikipedia
In mathematics, the linear span (also called the linear hull or just span) of a set S of vectors (from a vector space), denoted span(S), is the smallest linear subspace that contains the set. It can be characterized either as the intersection of all linear subspaces that contain S, or as the set of linear combinations of elements of S. The linear span of a set of vectors is therefore a vector space. Spans can be generalized to matroids and modules.
·en.wikipedia.org·
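One way to test span membership numerically: a vector lies in span(S) exactly when appending it to S does not increase the rank. A sketch (the vectors are made up for illustration):

```python
import numpy as np

# Rows of S are the spanning vectors; span(S) is a plane in R^3.
S = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

def in_span(v, S):
    # v is a linear combination of the rows of S iff adding it as a
    # new row does not increase the rank.
    return np.linalg.matrix_rank(np.vstack([S, v])) == np.linalg.matrix_rank(S)

print(in_span(np.array([2.0, 3.0, 5.0]), S))  # True: 2*s1 + 3*s2
print(in_span(np.array([0.0, 0.0, 1.0]), S))  # False
```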
Linear subspace - Wikipedia
In mathematics, and more specifically in linear algebra, a linear subspace, also known as a vector subspace, is a vector space that is a subset of some larger vector space. A linear subspace is usually simply called a subspace when the context serves to distinguish it from other types of subspaces.
·en.wikipedia.org·
Bigram - Wikipedia
A bigram or digram is a sequence of two adjacent elements from a string of tokens, which are typically letters, syllables, or words. A bigram is an n-gram for n=2. The frequency distribution of every bigram in a string is commonly used for simple statistical analysis of text in many applications, including in computational linguistics, cryptography, speech recognition, and so on. Gappy bigrams or skipping bigrams are word pairs which allow gaps (perhaps avoiding connecting words, or allowing some simulation of dependencies, as in a dependency grammar). Head word bigrams are gappy bigrams with an explicit dependency relationship.
·en.wikipedia.org·
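Bigrams are one zip away in Python; their frequency distribution is the "simple statistical analysis" mentioned above:

```python
from collections import Counter

tokens = "the quick fox jumped over the quick fox".split()

bigrams = list(zip(tokens, tokens[1:]))  # adjacent pairs: n-grams with n = 2
print(bigrams[:3])  # [('the', 'quick'), ('quick', 'fox'), ('fox', 'jumped')]

print(Counter(bigrams).most_common(2))
# [(('the', 'quick'), 2), (('quick', 'fox'), 2)]
```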
Determinant - Wikipedia
In mathematics, the determinant is a scalar value that is a function of the entries of a square matrix. It allows characterizing some properties of the matrix and the linear map represented by the matrix. In particular, the determinant is nonzero if and only if the matrix is invertible and the linear map represented by the matrix is an isomorphism. The determinant of a product of matrices is the product of their determinants (the preceding property is a corollary of this one). The determinant of a matrix A is denoted det(A), det A, or |A|.
·en.wikipedia.org·
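The two properties quoted above, checked numerically on small matrices:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])

print(np.linalg.det(A))  # ~5.0: nonzero, so A is invertible

# Multiplicativity: det(AB) = det(A) * det(B)
print(np.linalg.det(A @ B))                    # ~-5.0
print(np.linalg.det(A) * np.linalg.det(B))     # ~-5.0
```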
Jacobian matrix and determinant - Wikipedia
In vector calculus, the Jacobian matrix (/dʒəˈkoʊbiən/, /dʒɪ-, jɪ-/) of a vector-valued function of several variables is the matrix of all its first-order partial derivatives. When this matrix is square, that is, when the function takes the same number of variables as input as the number of vector components of its output, its determinant is referred to as the Jacobian determinant. Both the matrix and (if applicable) the determinant are often referred to simply as the Jacobian in the literature.
·en.wikipedia.org·
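A symbolic example with SymPy (the particular f is just for illustration):

```python
import sympy as sp

x, y = sp.symbols("x y")
f = sp.Matrix([x**2 * y, 5 * x + sp.sin(y)])  # f : R^2 -> R^2

J = f.jacobian([x, y])  # matrix of all first-order partial derivatives
print(J)                # Matrix([[2*x*y, x**2], [5, cos(y)]])

# f maps R^2 to R^2, so the Jacobian is square and its determinant exists.
print(J.det())          # 2*x*y*cos(y) - 5*x**2
```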
Derivative - Wikipedia
In mathematics, the derivative of a function of a real variable measures the sensitivity to change of the function value (output value) with respect to a change in its argument (input value). Derivatives are a fundamental tool of calculus. For example, the derivative of the position of a moving object with respect to time is the object's velocity: this measures how quickly the position of the object changes when time advances.
·en.wikipedia.org·
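The velocity example, estimated numerically with a central difference quotient (the free-fall position function is an assumed toy model):

```python
def position(t: float) -> float:
    return 4.9 * t**2          # metres fallen after t seconds (toy model)

def velocity(t: float, h: float = 1e-6) -> float:
    # Central difference: how quickly position changes as time advances.
    return (position(t + h) - position(t - h)) / (2 * h)

print(velocity(3.0))  # ~29.4 m/s, matching the exact derivative 9.8 * t
```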
Recursive definition - Wikipedia
In mathematics and computer science, a recursive definition, or inductive definition, is used to define the elements in a set in terms of other elements in the set (Aczel 1977:740ff). Some examples of recursively definable objects include factorials, natural numbers, Fibonacci numbers, and the Cantor ternary set.
·en.wikipedia.org·
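Two of the listed examples, written as recursive definitions in Python (a base case plus a rule in terms of earlier elements):

```python
def factorial(n: int) -> int:
    return 1 if n == 0 else n * factorial(n - 1)    # 0! = 1; n! = n * (n-1)!

def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)  # F0 = 0, F1 = 1

print(factorial(5))                # 120
print([fib(i) for i in range(8)])  # [0, 1, 1, 2, 3, 5, 8, 13]
```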
Natural Language Inference: An Overview
Benchmark and models
Natural Language Inference (NLI) is the task of determining whether the given “hypothesis” logically follows from the “premise”. In layman’s terms, you need to understand whether the hypothesis is true, while the premise is your only knowledge about the subject.
·towardsdatascience.com·
Natural-language understanding - Wikipedia
Natural-language understanding (NLU) or natural-language interpretation (NLI) is a subtopic of natural-language processing in artificial intelligence that deals with machine reading comprehension. Natural-language understanding is considered an AI-hard problem. There is considerable commercial interest in the field because of its application to automated reasoning, machine translation, question answering, news-gathering, text categorization, voice-activation, archiving, and large-scale content analysis.
·en.wikipedia.org·
Dynamic programming - Wikipedia
Dynamic programming is both a mathematical optimization method and a computer programming method. The method was developed by Richard Bellman in the 1950s and has found applications in numerous fields, from aerospace engineering to economics. In both contexts it refers to simplifying a complicated problem by breaking it down into simpler sub-problems in a recursive manner.
·en.wikipedia.org·
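A minimal sketch of the "simpler sub-problems" idea: minimum-coin change with memoized recursion (the coin set is arbitrary, chosen so a greedy strategy fails):

```python
from functools import lru_cache

COINS = (1, 4, 5)

@lru_cache(maxsize=None)        # each sub-problem is solved exactly once
def min_coins(amount: int) -> int:
    """Fewest coins from COINS summing to `amount`."""
    if amount == 0:
        return 0
    # The optimal answer combines optimal answers to smaller sub-problems.
    return 1 + min(min_coins(amount - c) for c in COINS if c <= amount)

print(min_coins(13))  # 3 (5 + 4 + 4); greedy 5 + 5 + 1 + 1 + 1 would use 5
```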
AI alignment - Wikipedia
In the field of artificial intelligence (AI), AI alignment research aims to steer AI systems towards their designers’ intended goals and interests. An AI system is described as misaligned if it is competent but advances an unintended objective.
·en.wikipedia.org·
Tokenization
Given a character sequence and a defined document unit, tokenization is the task of chopping it up into pieces, called tokens, perhaps at the same time throwing away certain characters, such as punctuation.
·nlp.stanford.edu·
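A toy version of that definition in Python: chop a character sequence into tokens and throw away punctuation (the regex is one arbitrary choice of rule):

```python
import re

text = "O'Neill said: tokenization isn't trivial, is it?"

# Keep alphanumeric runs, allowing one internal apostrophe; drop punctuation.
tokens = re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", text)
print(tokens)
# ["O'Neill", 'said', 'tokenization', "isn't", 'trivial', 'is', 'it']
```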