
OCaml Sockets
In the previous article, we explored file handles and how they are used in file manipulation. In this article, we continue the discussion of handles, now focusing on socket handles. The fundamental concept remains the same. A socket is a communication channel managed by the operating system, and a socket handle is the identifier a process uses to reference that channel. Using a socket handle is similar to using a file handle: we ask the OS to open a socket, it returns a socket handle that we can use to send and receive data, and we must close the handle once we are done with it. The difference is that, instead of interacting with a file, we interact with other processes, which may be local or remote, in a transparent manner.
Creating a Socket
The socket has a communication domain that limits the type of communication that can be performed and also determines the format of the socket address.
| name | description |
|---|---|
| PF_UNIX | local communication (Unix) |
| PF_INET | internet communication (IPv4) |
| PF_INET6 | internet communication (IPv6) |
PF_UNIX uses a path in the machine's file system as the address, limiting communication to processes running on the same machine. PF_INET and PF_INET6, on the other hand, are used for internet communication, where the address is a pair of host and port. This enables communication between processes running on any two machines connected to the internet.
The socket's communication type, in turn, indicates whether the communication is reliable (no packet loss or data duplication) and also determines how data is sent and received (byte stream or sequence of packets). The communication type restricts the protocol used for data transmission.
| Type | Reliable | Data representation |
|---|---|---|
| SOCK_STREAM | yes | Byte stream |
| SOCK_DGRAM | no | Packets |
| SOCK_RAW | no | Packets |
| SOCK_SEQPACKET | yes | Sequence of packets |
The socket function, located in the Unix module, has the following signature:
```ocaml
type socket_domain =
    PF_UNIX                     (** Unix domain *)
  | PF_INET                     (** Internet domain (IPv4) *)
  | PF_INET6                    (** Internet domain (IPv6) *)
(** The type of socket domains. Not all platforms support
    IPv6 sockets (type [PF_INET6]).

    On Windows: [PF_UNIX] not implemented. *)

type socket_type =
    SOCK_STREAM                 (** Stream socket *)
  | SOCK_DGRAM                  (** Datagram socket *)
  | SOCK_RAW                    (** Raw socket *)
  | SOCK_SEQPACKET              (** Sequenced packets socket *)
(** The type of socket kinds, specifying the semantics of
    communications. [SOCK_SEQPACKET] is included for completeness,
    but is rarely supported by the OS, and needs system calls that
    are not available in this library. *)

val socket : socket_domain -> socket_type -> int -> file_descr
```

So, based on what we have seen so far, we only need to understand the third parameter of the socket function. This parameter is the protocol that will be used for communication, represented by an integer. The value 0 indicates the selection of the default protocol for a given communication domain and type (e.g., UDP for SOCK_DGRAM). The numbers for these protocols can be found in the /etc/protocols file or in the NIS (Network Information Service) protocols table.
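The same protocol numbers can also be queried from OCaml instead of being read by hand. The following is a minimal sketch, assuming the standard Unix library is linked; the helper name protocol_number is hypothetical and introduced only for illustration. It uses Unix.getprotobyname, which raises Not_found when the local protocols database has no matching entry, and falls back to 0 (the default protocol) in that case.

```ocaml
(* Hypothetical helper: look up a protocol number by name.
   Returns None if the local protocols database has no such entry. *)
let protocol_number name =
  match Unix.getprotobyname name with
  | entry -> Some entry.Unix.p_proto
  | exception Not_found -> None

let () =
  (* Fall back to 0, the default protocol, when the lookup fails. *)
  let tcp = Option.value (protocol_number "tcp") ~default:0 in
  let s = Unix.socket Unix.PF_INET Unix.SOCK_STREAM tcp in
  Printf.printf "created a stream socket with protocol number %d\n" tcp;
  Unix.close s
```

On a machine with a populated /etc/protocols, the lookup for "tcp" should yield the number listed in that file; passing either that number or 0 selects TCP for an IPv4 stream socket.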
The file looks like this:

```
# Internet (IP) protocols
#
# Updated from http://www.iana.org/assignments/protocol-numbers and other
# sources.
# New protocols will be added on request if they have been officially
# assigned by IANA and are not historical.
# If you need a huge list of used numbers please install the nmap package.

ip              0   IP              # internet protocol, pseudo protocol number
hopopt          0   HOPOPT          # IPv6 Hop-by-Hop Option [RFC1883]
icmp            1   ICMP            # internet control message protocol
igmp            2   IGMP            # Internet Group Management
ggp             3   GGP             # gateway-gateway protocol
ipencap         4   IP-ENCAP        # IP encapsulated in IP (officially ``IP'')
st              5   ST              # ST datagram mode
tcp             6   TCP             # transmission control protocol
egp             8   EGP             # exterior gateway protocol
igp             9   IGP             # any private interior gateway (Cisco)
pup             12  PUP             # PARC universal packet protocol
udp             17  UDP             # user datagram protocol
hmp             20  HMP             # host monitoring protocol
xns-idp         22  XNS-IDP         # Xerox NS IDP
rdp             27  RDP             # "reliable datagram" protocol
iso-tp4         29  ISO-TP4         # ISO Transport Protocol class 4 [RFC905]
dccp            33  DCCP            # Datagram Congestion Control Prot. [RFC4340]
xtp             36  XTP             # Xpress Transfer Protocol
ddp             37  DDP             # Datagram Delivery Protocol
idpr-cmtp       38  IDPR-CMTP       # IDPR Control Message Transport
ipv6            41  IPv6            # Internet Protocol, version 6
ipv6-route      43  IPv6-Route      # Routing Header for IPv6
ipv6-frag       44  IPv6-Frag       # Fragment Header for IPv6
idrp            45  IDRP            # Inter-Domain Routing Protocol
rsvp            46  RSVP            # Reservation Protocol
gre             47  GRE             # General Routing Encapsulation
esp             50  IPSEC-ESP       # Encap Security Payload [RFC2406]
ah              51  IPSEC-AH        # Authentication Header [RFC2402]
skip            57  SKIP            # SKIP
ipv6-icmp       58  IPv6-ICMP       # ICMP for IPv6
ipv6-nonxt      59  IPv6-NoNxt      # No Next Header for IPv6
ipv6-opts       60  IPv6-Opts       # Destination Options for IPv6
rspf            73  RSPF CPHB       # Radio Shortest Path First (officially CPHB)
vmtp            81  VMTP            # Versatile Message Transport
eigrp           88  EIGRP           # Enhanced Interior Routing Protocol (Cisco)
ospf            89  OSPFIGP         # Open Shortest Path First IGP
ax.25           93  AX.25           # AX.25 frames
ipip            94  IPIP            # IP-within-IP Encapsulation Protocol
etherip         97  ETHERIP         # Ethernet-within-IP Encapsulation [RFC3378]
encap           98  ENCAP           # Yet Another IP encapsulation [RFC1241]
#               99                  # any private encryption scheme
pim             103 PIM             # Protocol Independent Multicast
ipcomp          108 IPCOMP          # IP Payload Compression Protocol
vrrp            112 VRRP            # Virtual Router Redundancy Protocol [RFC5798]
l2tp            115 L2TP            # Layer Two Tunneling Protocol [RFC2661]
isis            124 ISIS            # IS-IS over IPv4
sctp            132 SCTP            # Stream Control Transmission Protocol
fc              133 FC              # Fibre Channel
mobility-header 135 Mobility-Header # Mobility Support for IPv6 [RFC3775]
udplite         136 UDPLite         # UDP-Lite [RFC3828]
mpls-in-ip      137 MPLS-in-IP      # MPLS-in-IP [RFC4023]
manet           138                 # MANET Protocols [RFC5498]
hip             139 HIP             # Host Identity Protocol
shim6           140 Shim6           # Shim6 Protocol [RFC5533]
wesp            141 WESP            # Wrapped Encapsulating Security Payload
rohc            142 ROHC            # Robust Header Compression
```

With that, we can create a socket for internet communication (IPv4) using the TCP protocol as follows:
```ocaml
let socket = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0
```

The next step is to connect the socket to an address. For that, we use the connect function:
```ocaml
val connect : file_descr -> sockaddr -> unit
(** Connect a socket to an address. *)
```

Connecting to an Address
Now, we have been introduced to a new type, sockaddr. This type is used to represent a socket address.
```ocaml
type sockaddr =
  | ADDR_UNIX of string
  | ADDR_INET of inet_addr * int
```

ADDR_UNIX f is a local socket address, where f is the file name in the file system. ADDR_INET (a,p) is an internet socket address, where a is the IP address and p is the port.
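To make the two constructors concrete, here is a small sketch that builds one address of each kind and renders it as text. The function string_of_sockaddr, the path /tmp/example.sock, and the ports are assumptions chosen for illustration; 192.0.2.1 is an address reserved for documentation. Nothing is created on disk and no connection is made.

```ocaml
(* Hypothetical helper: render a socket address as text. *)
let string_of_sockaddr = function
  | Unix.ADDR_UNIX path -> "unix:" ^ path
  | Unix.ADDR_INET (addr, port) ->
      Printf.sprintf "%s:%d" (Unix.string_of_inet_addr addr) port

let () =
  (* A local address is just a file system path. *)
  let local = Unix.ADDR_UNIX "/tmp/example.sock" in
  (* An internet address pairs an IP address with a port. *)
  let loopback = Unix.ADDR_INET (Unix.inet_addr_loopback, 8080) in
  let literal = Unix.ADDR_INET (Unix.inet_addr_of_string "192.0.2.1", 80) in
  List.iter
    (fun a -> print_endline (string_of_sockaddr a))
    [ local; loopback; literal ]
```

Unix.inet_addr_of_string only parses a numeric address; turning a host name into an inet_addr requires a DNS lookup.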
The host command-line utility performs a DNS lookup on the name passed as an argument and prints the IP addresses associated with the domain.
```shell
$ host ocaml.org
ocaml.org has address 51.159.83.169
ocaml.org mail is handled by 50 fb.mail.gandi.net.
ocaml.org mail is handled by 10 spool.mail.gandi.net.
```

We received one IP address associated with the domain ocaml.org. By convention, port 80 is used for HTTP.
An equivalent way to perform the same action in OCaml would be:
```ocaml
# Unix.string_of_inet_addr (Unix.gethostbyname "ocaml.org").h_addr_list.(0);;
- : string = "51.159.83.169"
```

Therefore, we can create our sockaddr as follows:
```ocaml
let ocaml_org_address =
  let ocaml_org_host_entry = Unix.gethostbyname "ocaml.org" in
  Unix.ADDR_INET (ocaml_org_host_entry.h_addr_list.(0), 80);;
```

Now, using everything we have seen and adding a few new functions, which I will explain shortly, we have:
```ocaml
(* Read everything the peer sends on [fdin] and echo it to standard output. *)
let print_server_response fdin =
  let buffer_size = 4096 in
  let buffer = Bytes.create buffer_size in
  let rec copy () =
    match Unix.read fdin buffer 0 buffer_size with
    | 0 -> () (* nothing more to read *)
    | n ->
      let response = Bytes.sub_string buffer 0 n in
      print_string response;
      copy ()
  in
  copy ();;

(* Connect to [address], send a greeting, and print the reply. *)
let make_friend address =
  let s = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
  try
    Unix.connect s address;
    let message = "Olá Mundo! \n" in
    Unix.send_substring s message 0 (String.length message) [] |> ignore;
    print_server_response s;
    Unix.shutdown s Unix.SHUTDOWN_ALL;
    Unix.close s
  with exn ->
    (* Release the descriptor on failure, then re-raise. *)
    Unix.close s;
    raise exn;;

let ocaml_org_address =
  let ocaml_org_host_entry = Unix.gethostbyname "ocaml.org" in
  Unix.ADDR_INET (ocaml_org_host_entry.h_addr_list.(0), 80);;
```

The print_server_response function receives a file descriptor and prints the content received from the server. To do this, it creates a 4096-byte buffer and reads the content from the file descriptor into the buffer, printing the buffer's content until there is nothing more to read.
The make_friend function creates a socket and tries to connect to an address passed as a parameter. If the connection is successful, it sends the message "Olá Mundo! \n" to the server and prints the received response. Finally, it closes the socket.
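Closing the socket on every path, including when an exception is raised, is easy to get wrong by hand. One possible way to package that pattern is sketched below with Fun.protect from the standard library (available since OCaml 4.08). The names with_socket and is_closed are hypothetical helpers, not part of the Unix module, and the failure is simulated.

```ocaml
(* Hypothetical helper: run [f] on a fresh socket and close the socket
   afterwards, whether [f] returns normally or raises. *)
let with_socket domain kind f =
  let s = Unix.socket domain kind 0 in
  Fun.protect ~finally:(fun () -> Unix.close s) (fun () -> f s)

(* Hypothetical helper: true if [fd] no longer refers to an open descriptor. *)
let is_closed fd =
  match Unix.fstat fd with
  | _ -> false
  | exception Unix.Unix_error (Unix.EBADF, _, _) -> true

let () =
  let seen = ref None in
  (* Simulate a failure while the socket is in use. *)
  (try
     with_socket Unix.PF_INET Unix.SOCK_STREAM (fun s ->
         seen := Some s;
         failwith "simulated failure")
   with Failure _ -> ());
  match !seen with
  | Some s -> assert (is_closed s)
  | None -> assert false
```

With such a helper, a function like make_friend would only need to describe what happens while the socket is open; the cleanup lives in one place.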
I would like to point out the use of the read function:
```ocaml
val read : file_descr -> bytes -> int -> int -> int
(** [read fd buf pos len] reads [len] bytes from descriptor [fd],
    storing them in byte sequence [buf], starting at position [pos]
    in [buf]. Return the number of bytes actually read. *)
```

The read function reads up to len bytes from the file_descr fd and stores them in the buffer buf starting at position pos. It returns the number of bytes actually read, which may be fewer than len. A return value of 0 means there is no more data to be read: for a socket, the other side has closed the connection. This is exactly the logic we applied in the print_server_response function.
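The "read until 0" loop can be tried without any network access. The sketch below is an illustration under assumptions: read_all is a hypothetical helper, and Unix.socketpair (a connected pair of local sockets, which may not be available on every platform) stands in for the client and the server.

```ocaml
(* Hypothetical helper: read from [fd] until [Unix.read] returns 0,
   collecting everything into a string. *)
let read_all fd =
  let chunk = Bytes.create 4096 in
  let acc = Buffer.create 4096 in
  let rec loop () =
    match Unix.read fd chunk 0 (Bytes.length chunk) with
    | 0 -> Buffer.contents acc
    | n ->
        Buffer.add_subbytes acc chunk 0 n;
        loop ()
  in
  loop ()

let () =
  (* Two connected local sockets: what is written to [a] is read from [b]. *)
  let a, b = Unix.socketpair Unix.PF_UNIX Unix.SOCK_STREAM 0 in
  let message = "hello over a socket" in
  let sent = Unix.write_substring a message 0 (String.length message) in
  assert (sent = String.length message);
  (* Closing the writing end is what lets the reader eventually see 0. *)
  Unix.close a;
  let received = read_all b in
  Unix.close b;
  print_endline received
```

If the writing end were left open, the final read would block waiting for more data instead of returning 0.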
To call the make_friend function, simply pass the server address as a parameter:
```
make_friend ocaml_org_address
- HTTP/1.1 400 Bad Request
- Content-Type: text/plain; charset=utf-8
- Connection: close
- 400 Bad Request
- : unit = ()
```

Well, at least we made a connection to the server. We found someone out there and sent a message; they responded. What the response means is that they did not understand our message. The response starts with "HTTP/1.1", and that is the topic of our next article.
Conclusion
In this article, we explored the creation of sockets and connecting to an address. We saw how to create a socket for internet communication (IPv4) using the TCP protocol and how to connect to an address. Additionally, we saw how to create a sockaddr and how to perform a DNS lookup to obtain the IP address associated with a domain.