ISI Digital Commons (Indian Statistical Institute)
7571 research outputs found
Graph Neural Networks for Homogeneous and Heterogeneous Graphs: Algorithms and Applications
A graph is used to represent complex systems where both entities and their interconnections are equally important. Real-life situations, e.g., social networks, biological networks, and recommender systems, are better modeled in terms of graphical structures, as the information about individual entities is not enough to understand the whole system. Due to the non-uniformity of graphical data, traditional machine learning algorithms that perform tasks like prediction and classification cannot be applied directly to such data. Graph Neural Networks (GNNs) are robust variants of deep neural network models that are typically designed to learn from such graphical data. A GNN transforms graph data into Euclidean representations that various machine-learning algorithms can utilize. In this thesis, two types of graphs have been studied. In the first two contributory chapters, the graphs considered are homogeneous, where all nodes are of the same type. Chapter 2 describes a model called Interval-Valued Graph Neural Network (IV-GNN), which has been developed to handle homogeneous graphs with interval-valued node features. This model relaxes the restriction that the node features should be single-valued. Here, interval-valued features are allowed, and the corresponding GNN model, along with its mathematical analysis, is presented. Chapter 3 discusses the importance of hierarchical structure learning within a graph. It describes a model called GraMMy, which is designed for hierarchical semantics-driven graph representation learning based on Micro-Macro analysis. It focuses on the graph at different levels of abstraction to allow the flexible flow of information between the higher-order neighborhoods. The task that we aim to perform on the homogeneous graphs in Chapters 2 and 3 is graph classification. The second part of the thesis deals with heterogeneous graphs. We consider the social recommender system as an area of application.
We have modeled the problem of predicting the missing rating value of a user for an item as a link prediction task in a heterogeneous graph setting, where multiple types of nodes are present in the data. In our third contribution (Chapter 4), the aim is to quantify the usefulness of the ratings given by the user to an item. For this purpose, a metric called the Influence Score of a user has been defined and incorporated into a GNN-based recommender system to develop a Social Influence-aware recommendation system, SInGER. Although SInGER improves the prediction quality, a limitation of the approach is the uniform definition of the Influence Score, irrespective of the data set considered. To overcome this, in the fourth work (Chapter 5), we develop a neural architecture to capture user trust without explicitly defining it. It provides an effective means of implicitly accounting for trust propagation and composability while performing GNN-based analyses to accomplish the overall task of item rating prediction.
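The abstract above describes GNNs as turning graph data into Euclidean representations that are then used for graph classification. A minimal sketch of that idea follows, assuming plain mean aggregation over neighbours and a mean readout; these choices, and all function names, are illustrative assumptions and are not the IV-GNN or GraMMy models from the thesis.

```python
from typing import Dict, List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def message_passing_round(adj: Dict[int, List[int]],
                          feats: Dict[int, Vector]) -> Dict[int, Vector]:
    """One round: every node averages its own feature with its neighbours'."""
    return {
        node: mean([feats[node]] + [feats[nb] for nb in adj[node]])
        for node in adj
    }


def graph_embedding(adj: Dict[int, List[int]],
                    feats: Dict[int, Vector],
                    rounds: int = 2) -> Vector:
    """Run several rounds, then average all node vectors (mean readout).

    The result is a fixed-length Euclidean vector for the whole graph,
    which a standard classifier could consume.
    """
    for _ in range(rounds):
        feats = message_passing_round(adj, feats)
    return mean(list(feats.values()))


# Toy homogeneous graph: a path 0 - 1 - 2 with 2-dimensional node features.
adj = {0: [1], 1: [0, 2], 2: [1]}
feats = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 0.0]}
embedding = graph_embedding(adj, feats, rounds=1)

# Relabelling the nodes must not change the graph-level vector.
adj_relabelled = {2: [1], 1: [2, 0], 0: [1]}
feats_relabelled = {2: [1.0, 0.0], 1: [0.0, 1.0], 0: [1.0, 0.0]}
embedding_relabelled = graph_embedding(adj_relabelled, feats_relabelled, rounds=1)
```

A real GNN would insert learned weight matrices and non-linearities between rounds; the sketch keeps only the aggregation and readout structure.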
Crystallization of the quantized function algebras of SUq(n + 1)
The $q$-deformation of a connected, simply connected Lie group is typically studied through two Hopf algebras associated with it: the quantized universal enveloping algebra and the quantized function algebra (QFA). If the group has a compact real form, one can use the Cartan involution to give a $*$-structure on the QFA. Here we study the crystal limits of the QFA with this structure and of its $C^*$-completion, and classify all irreducible representations of the crystallized algebras. We also prove that the crystallized algebra carries a natural bialgebra structure.
Essays in Individual and Collective Choice
This thesis consists of three chapters that address problems in social choice theory, fair division of a heterogeneous good, and choosing a pair of complementary goods, respectively. The first chapter deals with designing a voting mechanism. There is a finite set of alternatives arranged according to an exogenous order. The output of the mechanism must be a (fixed-cardinality) set of contiguous alternatives, referred to as an interval. We find a mechanism that is Pareto efficient and strategy-proof. In the second chapter, we study the division of a heterogeneous resource. Each agent must be allocated a continuous (interval) subset of the interval [0,1]. The agents have preferences that are single-peaked in one dimension (quantity) but not in another (location). We characterize the full set of Pareto efficient and envy-free (i.e., fair) allocations. We also show that there is no rule that is strategy-proof, envy-free and Pareto efficient. In the third chapter, we propose a heuristic for how a decision-maker (DM) might choose a pair of complementary goods, one alternative each from two sets. The modeling of complementarity is done without relying on prices or utility functions. The choices of the DM are represented by joint choice functions. We first define the concepts of weak and strong complements. We then characterize what we call weak-complements choice functions and provide necessary and sufficient conditions for the existence of strong-complements choice functions.
Enhancing Medical Image Analysis through Deep Learning: A Comprehensive Study on Classification, Segmentation, and Multitask Learning
Medical image analysis has become indispensable for accurate diagnosis and treatment planning. However, despite advances in deep learning, several critical challenges persist, ranging from the need for more efficient models to the integration of multiple tasks within a unified framework. This thesis addresses these challenges by proposing innovative deep learning architectures that enhance medical image classification, segmentation, and multitask learning. At the heart of this research is the goal of developing models that deliver high performance and tackle the nuanced complexities of medical data. Existing classification models often overlook valuable information hidden in the spectral domain of images. I address this by integrating spatial and spectral features, demonstrating their complementary power to detect diseases such as COVID-19 from chest radiographs. This approach facilitates a more holistic understanding of medical images, improving the accuracy and reliability of diagnostic systems. To further enhance image classification, I explore hybrid architectures that combine convolutional and transformer-based models. These models leverage the strengths of both architectures, capturing fine-grained visual details and long-range dependencies. This significantly improves results on various medical imaging datasets, offering deeper interpretability and superior classification accuracy, particularly in complex diagnostic scenarios. Moving beyond classification, I tackle the fundamental challenge of segmenting complex and irregular regions within medical images, where traditional deep learning models often struggle. To overcome this, I introduce a novel segmentation framework that combines the power of deep neural networks with trainable morphological operations. This leads to a more precise delineation of regions of interest, even in challenging clinical scenarios, setting a new benchmark for medical image segmentation.
One of the most pressing issues in medical imaging is the inefficiency of current multitask learning models, which often require vast computational resources and struggle to generalize across different tasks. I present a lightweight multitask learning framework that excels at both segmentation and classification, particularly in breast tumor analysis. Using novel morphological attention mechanisms and the sharing of task-specific knowledge, the proposed model significantly reduces computational complexity while improving performance. Importantly, this framework shows versatility across various medical imaging domains, from gland segmentation and malignancy detection in histology images to skin lesion analysis, demonstrating its robustness and applicability in real-world settings. Altogether, this thesis offers solutions to some of the most pressing problems in medical image analysis, providing models that are not only more accurate but also computationally efficient, making them suitable for deployment in clinical practice.
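The abstract above refers to trainable morphological operations. As a rough illustration of the underlying operators, the sketch below implements one-dimensional grayscale dilation and erosion with an additive structuring element. The function names and the toy signal are assumptions for illustration; in a trainable setting the `weights` would be learned parameters, which this sketch keeps fixed, and it does not reproduce the thesis's architecture.

```python
from typing import List


def dilate(signal: List[float], weights: List[float]) -> List[float]:
    """Grayscale dilation: out[i] = max_j (signal[i + j - r] + weights[j]).

    `weights` is an odd-length structuring element centred at r = len // 2;
    positions that fall outside the signal are skipped.
    """
    r = len(weights) // 2
    out = []
    for i in range(len(signal)):
        candidates = [
            signal[i + j - r] + w
            for j, w in enumerate(weights)
            if 0 <= i + j - r < len(signal)
        ]
        out.append(max(candidates))
    return out


def erode(signal: List[float], weights: List[float]) -> List[float]:
    """Grayscale erosion: out[i] = min_j (signal[i + j - r] - weights[j])."""
    r = len(weights) // 2
    out = []
    for i in range(len(signal)):
        candidates = [
            signal[i + j - r] - w
            for j, w in enumerate(weights)
            if 0 <= i + j - r < len(signal)
        ]
        out.append(min(candidates))
    return out


# A flat (all-zero) structuring element of width 3 and a single-pixel spike.
flat = [0.0, 0.0, 0.0]
spike = [0.0, 0.0, 1.0, 0.0, 0.0]

dilated = dilate(spike, flat)               # the spike grows to width 3
eroded = erode(spike, flat)                 # the spike disappears
opened = dilate(erode(spike, flat), flat)   # opening removes the isolated spike
```

Because max and min are piecewise differentiable, the structuring element can in principle be updated by gradient descent, which is what makes such layers trainable.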
Problems in Affine Algebraic Geometry: on Triviality and Embedding of Linear Hyperplanes and Rigidity of Pham-Brieskorn surfaces
The thesis consists of two topics from Affine Algebraic Geometry: the first is on linear varieties (varieties defined by polynomials which are linear in one variable), and the second explores when Pham-Brieskorn surfaces do not admit non-trivial $G_a$-actions. Linear varieties over a field $k$ have been playing a central role in the study of some of the challenging problems on affine spaces, like the Zariski Cancellation Problem and the Linearization Problem. Breakthroughs on such problems have occurred by examining two questions on linear polynomials of the form $H := \alpha(X_1, \dots, X_m)Y - F(X_1, \dots, X_m, Z, T) \in D := k[X_1, \dots, X_m, Y, Z, T]$: (i) whether $H$ defines a hyperplane, i.e., whether the affine variety $V \subseteq \mathbb{A}^{m+3}_k$ defined by $H$ is isomorphic to the affine space $\mathbb{A}^{m+2}_k$; (ii) if $V$ is isomorphic to an affine space, whether $H$ is a coordinate in $D$. Question (i) connects to the Characterization Problem of identifying affine spaces among affine varieties; Question (ii) is a special case of the formidable Embedding Problem for affine spaces. In Chapter 3 of the thesis, using K-theory and $G_a$-actions, we address these questions under certain conditions on $\alpha$ and $F$. For instance, we show that when the characteristic of $k$ is zero, $F \in k[Z, T]$ and $H$ defines a hyperplane, then $H$ is a coordinate in $D$ along with $X_1, X_2, \dots, X_m$. Our results yield a certain family of higher-dimensional hyperplanes satisfying the Abhyankar–Sathaye Conjecture on the Epimorphism Problem, and an infinite family of higher-dimensional non-isomorphic varieties which are counterexamples to the Zariski Cancellation Problem in positive characteristic and the $\mathbb{A}^2$-fibration Problem in positive characteristic. We have also discussed the above two questions with the field $k$ replaced by a Noetherian integral domain $R$. In Chapter 4 of the thesis, we have discussed the rigidity of Pham-Brieskorn rings. Over any field $k$, for $n \in \mathbb{Z}_{>3}$ and $a_1, \dots, a_n \in \mathbb{Z}_{>1}$, Pham-Brieskorn rings are denoted by $B_{(a_1,\dots,a_n)}$ and defined by $B_{(a_1,\dots,a_n)} := k[X_1, \dots, X_n]/(X_1^{a_1} + \cdots + X_n^{a_n})$. We showed that every non-domain Pham-Brieskorn ring, for $n \in \mathbb{Z}_{>3}$, is non-rigid. For any three integers $a, b, c > 1$, we give some sufficient conditions on $(a, b, c)$ for which the Pham-Brieskorn domain $B_{(a,b,c)}$ is rigid. This gives an alternative approach to show that over a field $k$ of characteristic $p > 0$, there does not exist any non-trivial exponential map on $k[X, Y, Z, T]/(X^m Y + T^{p^r q} + Z^{p^e}) = k[x, y, z, t]$, for $m, q > 1$, $p \nmid mq$ and $e > r > 1$, which fixes $y$, a crucial result used in “On the cancellation problem for the affine space $\mathbb{A}^3$ in characteristic $p$, Invent. Math. 195” by Neena Gupta to show that the Zariski Cancellation Problem does not hold for the affine 3-space. We also provide a sufficient condition for $B_{(a,b,c)}$ to be stably rigid.
Provable Security in Idealised Models
This thesis is a compilation of provable security analyses of various cryptographic constructions in idealised models. The first construction examined is the ABR hash. We revisit the existing proof of the ABR hash in the random oracle model and identify significant errors in the proof. Although we are unable to correct the original proof, we establish the security of the ABR tree of height 3 from scratch, addressing the first non-trivial case. As our second contribution, we conduct a tight and comprehensive security analysis of the Ascon AEAD mode in the random permutation model. We show that the efficiency of Ascon can be increased by 50%, and the tag size can be halved without losing any security. In the third contribution, we extend our security analysis of Ascon to the multi-user setting, providing tight security bounds for both nonce-respecting and nonce-misuse adversaries. Additionally, we propose LK-Ascon, a variant of Ascon with a key size of up to 256 bits, offering improved multi-user security compared to Ascon. As the final contribution, we introduce PACT, a transform that converts any authenticated encryption mode into a context-committing one without any output length expansion. PACT achieves this with a single call to a collision-resistant unkeyed hash function and one call to a block cipher, with the analysis performed in the ideal cipher model. We also propose comPACT, a faster version of PACT which gives a nonce-respecting committing authenticated encryption scheme.
On Zero-Shot Recognition of Unseen State-Object Composition
Compositional Zero-Shot Learning (CZSL) attempts to recognise images of new (unseen) compositions of states and objects when images of only a subset of state-object compositions are available as training data. Thus, a CZSL model should recognise a young dog when the model has seen images of the state-object compositions young bear, old bear and old dog. There are multiple challenges in solving the CZSL problem. It is difficult to disentangle the visual features of the object dog and its state young from the compositional image young dog. The visual features of a state are observed to vary widely across compositions. For example, the state sliced has different visual features in the compositions sliced apple and sliced tomato. In the second chapter of the thesis, we attempt to disentangle the visual features of state and object using a two-stage sequential recognition approach. In the next chapter of the thesis, we work on the open-world CZSL problem, where no prior information about the feasibility of a state-object composition is available. We use a Graph Convolutional Network based architecture along with a frequency-based feasibility prediction approach for the open-world CZSL problem. Another challenge in CZSL lies in the fact that the extent of association between the features of a state and an object varies significantly across different images of the same composition. For example, in different images of peeled orange, the oranges may be peeled to a different extent. Thus, the visual features of images of peeled orange may vary. In the fourth chapter, a novel Knowledge-guided Transformer Network is proposed to better process the partial association between the visual features of state and object. In the fifth chapter, we attempt the partially supervised CZSL (pCZSL) problem, where for each state-object compositional image, either the state or the object annotation is available.
For this setting, we propose a novel vision transformer based architecture with a Locality Preserving Neighbourhood Aggregation approach. Effective identification of the discriminative features of state and object often depends on the scale of the object in the image. For example, in the images of the two compositions young bear and old bear, the identification of the states young and old may depend on recognising the scale (or size) of the object bear in the image. In the sixth chapter, we leverage a Vision Language Model (VLM) to estimate the scale-aware features in CZSL. Extensive experiments on the C-GQA, MIT-States and UT-Zappos50k datasets demonstrate the effectiveness of the approaches in this thesis when compared to the state-of-the-art in the closed-world CZSL, open-world CZSL and pCZSL settings. As concluding remarks, we discuss the future scope of research in CZSL.
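The CZSL setup in the abstract above (seen compositions young bear, old bear and old dog; unseen composition young dog) can be made concrete with a small sketch. The score dictionaries and the product-of-scores rule are hypothetical placeholders for a naive baseline; they are not the thesis's recognition or feasibility-prediction methods.

```python
from itertools import product

# Training (seen) compositions from the abstract's example.
seen = {("young", "bear"), ("old", "bear"), ("old", "dog")}

states = sorted({state for state, _ in seen})
objects = sorted({obj for _, obj in seen})

# Open-world CZSL: every state-object pair is a candidate label,
# because no feasibility information is given in advance.
open_world = set(product(states, objects))
unseen = open_world - seen

# Hypothetical per-image scores from separate state and object predictors.
state_scores = {"young": 0.7, "old": 0.3}
object_scores = {"dog": 0.8, "bear": 0.2}


def composition_score(pair):
    """Naive baseline: treat state and object as independent."""
    state, obj = pair
    return state_scores[state] * object_scores[obj]


# Predict the best-scoring composition over the open-world label space.
best = max(open_world, key=composition_score)
```

Treating state and object as independent is exactly the assumption that the challenges listed in the abstract (entangled features, state appearance varying across objects) call into question.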
On Automated Analysis of Lung Images with Deep Learning for Healthcare
Automated detection and diagnosis of lung diseases through medical image analysis offers a noninvasive alternative to invasive procedures, especially considering the challenges and potential risks associated with repeat lung operations. Noninvasive image-guided diagnostic techniques, such as lung imaging, have become essential in clinical practice. This thesis focuses on the development of a computer-aided system aimed at enhancing the classification, detection, and segmentation of lung diseases, specifically those caused by COVID-19 and lung tumors, leveraging advanced computational methods. Novel segmentation algorithms, EFMC and WDU-Net, are devised based on encoder-decoder architectures within deep convolution networks. These algorithms undergo rigorous validation against ground truth or manual segmentation by radiologists, ensuring their accuracy and reliability. The EFMC algorithm employs a selective focus mechanism with multi-resolution blocks, allowing for precise delineation of COVID-19 affected regions in lung CT scans. Its performance is validated through extensive comparison with expert annotations, demonstrating its effectiveness in capturing subtle abnormalities while accurately segmenting lung anomalies. Similarly, WDU-Net integrates weighted deformable convolution. Here, the deformable convolution modules enhance its ability to capture irregular shapes and features in COVID-19 and lung tumors. Validation against manual segmentation reveals its robustness and accuracy in segmenting COVID-19 and lung tumors from CT images, thereby showcasing its potential for aiding clinical diagnosis and treatment planning. Next, automated classification of lung tumors is devised in the multi-modal PET-CT framework using the innovative DEMF model. The network leverages deep convolution networks, in conjunction with dimensionality reduction, to efficiently detect and classify lung abnormalities.
This demonstrates superior performance in lung cancer classification across multimodal images. Finally, the DGMC is developed to enhance diagnosis and classification of diseases, by co-learning from multimodal images. Utilizing a novel multihead classifier, the DGMC can efficiently distinguish between COVID-19, tumors, and healthy slices of the lung. The input signal encompasses CT, along with EIT-processed CT scans, in order to provide a multimodal flavour. It captures granular details of the infection, while visualizing the activation regions. Together, these advancements represent significant progress in the automated analysis of lung diseases, by providing valuable tools for the early detection and diagnosis in clinical settings
Some Nonparametric Tests for High-Dimensional and Functional Data
The advancement of information technology and sciences over the last few decades has facilitated the collection, storage and analysis of huge data sets. Many of these data sets contain observations having a large number of features, and in some cases, this number is comparable to or even much larger than the sample size. Many traditional statistical methods cannot be meaningfully used in such situations. We develop some inferential tools for such high-dimensional data. In particular, we consider the two-sample problem and the problem of testing spherical symmetry of a multivariate distribution. We construct some nonparametric tests in these contexts and investigate the limiting behavior of the proposed tests when the dimension diverges to infinity while the sample size may or may not grow with the dimension. Several simulated and real data sets are analyzed to compare their empirical performance with some state-of-the-art methods. In practice, we also encounter situations where the features are not scalars or finite-dimensional vectors, but functions or curves. We also focus on such functional data sets. We develop a two-sample test for functional data and construct a test for mutual independence among several random functions. Theoretical properties of our proposed tests are investigated under appropriate regularity conditions, and their empirical performance is evaluated by analyzing several simulated and real data sets against some state-of-the-art methods.
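To make the two-sample problem in the abstract above concrete, here is a minimal sketch of a generic permutation test in a setting where the dimension (50) exceeds the sample size (10 per group). The statistic (squared Euclidean distance between sample means), the toy data, and all names are assumptions chosen for illustration; this is a textbook permutation test, not one of the tests proposed in the thesis.

```python
import random
from typing import List, Sequence

Row = Sequence[float]


def mean_vector(rows: Sequence[Row]) -> List[float]:
    """Component-wise sample mean."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def statistic(x: Sequence[Row], y: Sequence[Row]) -> float:
    """Squared Euclidean distance between the two sample means."""
    mx, my = mean_vector(x), mean_vector(y)
    return sum((a - b) ** 2 for a, b in zip(mx, my))


def permutation_p_value(x: Sequence[Row], y: Sequence[Row],
                        n_perm: int = 199, seed: int = 0) -> float:
    """Permutation p-value for H0: both samples share one distribution."""
    rng = random.Random(seed)
    observed = statistic(x, y)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled sample
        if statistic(pooled[:len(x)], pooled[len(x):]) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one permutation.
    return (1 + hits) / (1 + n_perm)


# Dimension 50 exceeds the sample size of 10 per group.
dim = 50
x = [[0.01 * i] * dim for i in range(10)]
y_shifted = [[5.0 + 0.01 * i] * dim for i in range(10)]

p_shifted = permutation_p_value(x, y_shifted)  # samples differ by a large shift
p_same = permutation_p_value(x, x)             # identical samples
```

The permutation approach needs no covariance estimate, which is one reason such tests remain usable when the dimension is larger than the sample size.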
Dataset for Ranking Social Science Institutions in India through Responsible Research Assessment (RRA)
This dataset supports the study entitled: A Responsible Research Assessment Framework for Ranking Social Science Institutions in India. It comprises diverse research outputs, academic activities, and institutional resources of twenty-four ICSSR-funded institutions across the country. The dataset is primarily extracted from Scopus (a citation database), subsequently validated through Annual Reports of the ICSSR, covering the period from 2013 to 2022. Moreover, high-resolution images of all figures and tables used in the paper have also been shared to support transparency and facilitate detailed examination. The dataset is curated to provide a holistic view of institutional performance in order to showcase the publication productivity, academic activities, resource allocations, prestige values, stakeholder engagement, and societal impact through various quantitative indicators. The dataset enables replication of the study, cross-institutional benchmarking, and policy-driven analysis of research ecosystems. The dataset adheres to FAIR principles and is structured to enable reuse, thereby enhancing reproducibility as an ethical practice for generating better insights. Eventually, this dataset aims to accelerate scientific progress by building upon existing work, which is crucial for advancing open science practices