
In the previous article, we explored file handles and how they are used to manipulate files. In this article, we continue the discussion of handles, now focusing on socket handles. The fundamental concept remains the same: a socket is a communication channel managed by the operating system, and a socket handle is the identifier that a process uses to reference that channel. Using a socket handle is similar to using a file handle: we ask the OS to open a socket, it returns a handle that we use to send and receive data, and we must close the handle once we are done with it. The difference is that, instead of interacting with a file, we interact with other processes, which may be running on the same machine or on a remote one, in a transparent manner.
Creating a Socket
Every socket has a communication domain, which limits the kind of communication that can be performed and also determines the format of the socket address.
| name | description |
|---|---|
| PF_UNIX | local communication (Unix) |
| PF_INET | internet communication (IPv4) |
| PF_INET6 | internet communication (IPv6) |
PF_UNIX uses a name in the machine's file system as the address, which limits communication to processes running on the same machine. PF_INET and PF_INET6, on the other hand, are used for internet communication, where the address is a host and port pair. This enables communication between processes running on any two machines connected to the internet.
The socket's communication type, in turn, indicates whether the communication is reliable (no packet loss or data duplication) and determines how data is sent and received (as a byte stream or as a sequence of packets). The communication type also restricts which protocols can be used for data transmission.
| Type | Reliable | Data representation |
|---|---|---|
| SOCK_STREAM | yes | Byte stream |
| SOCK_DGRAM | no | Packets |
| SOCK_RAW | no | Packets |
| SOCK_SEQPACKET | yes | Sequence of packets |
The socket function, located in the Unix module, has the following signature:
type socket_domain =
    PF_UNIX                     (** Unix domain *)
  | PF_INET                     (** Internet domain (IPv4) *)
  | PF_INET6                    (** Internet domain (IPv6) *)
(** The type of socket domains. Not all platforms support
    IPv6 sockets (type [PF_INET6]).
    On Windows: [PF_UNIX] not implemented. *)

type socket_type =
    SOCK_STREAM                 (** Stream socket *)
  | SOCK_DGRAM                  (** Datagram socket *)
  | SOCK_RAW                    (** Raw socket *)
  | SOCK_SEQPACKET              (** Sequenced packets socket *)
(** The type of socket kinds, specifying the semantics of
    communications. [SOCK_SEQPACKET] is included for completeness,
    but is rarely supported by the OS, and needs system calls that
    are not available in this library. *)

val socket : socket_domain -> socket_type -> int -> file_descr
Based on what we have seen so far, only the third parameter of the socket function remains to be explained. This parameter is the protocol that will be used for communication, represented by an integer. The value 0 selects the default protocol for the given communication domain and type (e.g., UDP for SOCK_DGRAM in an internet domain). The numbers for these protocols can be found in the /etc/protocols file or in the NIS (Network Information Service) protocols table.
# Internet (IP) protocols
#
# Updated from http://www.iana.org/assignments/protocol-numbers and other
# sources.
# New protocols will be added on request if they have been officially
# assigned by IANA and are not historical.
# If you need a huge list of used numbers please install the nmap package.
ip 0 IP # internet protocol, pseudo protocol number
hopopt 0 HOPOPT # IPv6 Hop-by-Hop Option [RFC1883]
icmp 1 ICMP # internet control message protocol
igmp 2 IGMP # Internet Group Management
ggp 3 GGP # gateway-gateway protocol
ipencap 4 IP-ENCAP # IP encapsulated in IP (officially ``IP'')
st 5 ST # ST datagram mode
tcp 6 TCP # transmission control protocol
egp 8 EGP # exterior gateway protocol
igp 9 IGP # any private interior gateway (Cisco)
pup 12 PUP # PARC universal packet protocol
udp 17 UDP # user datagram protocol
hmp 20 HMP # host monitoring protocol
xns-idp 22 XNS-IDP # Xerox NS IDP
rdp 27 RDP # "reliable datagram" protocol
iso-tp4 29 ISO-TP4 # ISO Transport Protocol class 4 [RFC905]
dccp 33 DCCP # Datagram Congestion Control Prot. [RFC4340]
xtp 36 XTP # Xpress Transfer Protocol
ddp 37 DDP # Datagram Delivery Protocol
idpr-cmtp 38 IDPR-CMTP # IDPR Control Message Transport
ipv6 41 IPv6 # Internet Protocol, version 6
ipv6-route 43 IPv6-Route # Routing Header for IPv6
ipv6-frag 44 IPv6-Frag # Fragment Header for IPv6
idrp 45 IDRP # Inter-Domain Routing Protocol
rsvp 46 RSVP # Reservation Protocol
gre 47 GRE # General Routing Encapsulation
esp 50 IPSEC-ESP # Encap Security Payload [RFC2406]
ah 51 IPSEC-AH # Authentication Header [RFC2402]
skip 57 SKIP # SKIP
ipv6-icmp 58 IPv6-ICMP # ICMP for IPv6
ipv6-nonxt 59 IPv6-NoNxt # No Next Header for IPv6
ipv6-opts 60 IPv6-Opts # Destination Options for IPv6
rspf 73 RSPF CPHB # Radio Shortest Path First (officially CPHB)
vmtp 81 VMTP # Versatile Message Transport
eigrp 88 EIGRP # Enhanced Interior Routing Protocol (Cisco)
ospf 89 OSPFIGP # Open Shortest Path First IGP
ax.25 93 AX.25 # AX.25 frames
ipip 94 IPIP # IP-within-IP Encapsulation Protocol
etherip 97 ETHERIP # Ethernet-within-IP Encapsulation [RFC3378]
encap 98 ENCAP # Yet Another IP encapsulation [RFC1241]
# 99 # any private encryption scheme
pim 103 PIM # Protocol Independent Multicast
ipcomp 108 IPCOMP # IP Payload Compression Protocol
vrrp 112 VRRP # Virtual Router Redundancy Protocol [RFC5798]
l2tp 115 L2TP # Layer Two Tunneling Protocol [RFC2661]
isis 124 ISIS # IS-IS over IPv4
sctp 132 SCTP # Stream Control Transmission Protocol
fc 133 FC # Fibre Channel
mobility-header 135 Mobility-Header # Mobility Support for IPv6 [RFC3775]
udplite 136 UDPLite # UDP-Lite [RFC3828]
mpls-in-ip 137 MPLS-in-IP # MPLS-in-IP [RFC4023]
manet 138 # MANET Protocols [RFC5498]
hip 139 HIP # Host Identity Protocol
shim6 140 Shim6 # Shim6 Protocol [RFC5533]
wesp 141 WESP # Wrapped Encapsulating Security Payload
rohc 142 ROHC # Robust Header Compression
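The numbers in this listing do not have to be read by hand. As a minimal sketch, assuming the machine has a protocols database available, the Unix module's getprotobyname can look up an entry by name. The helper name protocol_number is hypothetical and introduced here only for illustration.

```ocaml
(* Requires the unix library (for example `#require "unix";;` in utop). *)

(* Hypothetical helper: look up a protocol number by name.
   Unix.getprotobyname raises Not_found when the name is unknown
   or when no protocols database is available on the machine. *)
let protocol_number (name : string) : int option =
  match Unix.getprotobyname name with
  | entry -> Some entry.Unix.p_proto
  | exception Not_found -> None

let () =
  List.iter
    (fun name ->
      match protocol_number name with
      | Some n -> Printf.printf "%s -> %d\n" name n
      | None -> Printf.printf "%s -> not found\n" name)
    [ "tcp"; "udp" ]
```

On a system with the listing above, this should report the same numbers as the tcp and udp lines. The returned value could be passed as the third argument of socket instead of 0 when a specific protocol is wanted.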
With that, we can create a socket for internet communication (IPv4) using the TCP protocol as follows:
let socket = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0
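Because a socket handle must be closed once we are done with it, it can help to pair creation and closing in one place. The following is a sketch, not part of the Unix module: with_socket is a hypothetical helper that creates a socket, passes it to a function, and closes it even if that function raises. It assumes OCaml 4.08 or later for Fun.protect.

```ocaml
(* Requires the unix library. *)

(* Hypothetical helper: create a socket, hand it to [f], and close it
   afterwards, whether [f] returns normally or raises. *)
let with_socket domain kind (f : Unix.file_descr -> 'a) : 'a =
  let s = Unix.socket domain kind 0 in
  Fun.protect ~finally:(fun () -> Unix.close s) (fun () -> f s)

let () =
  (* TCP over IPv4: protocol 0 selects the default for this pair. *)
  with_socket Unix.PF_INET Unix.SOCK_STREAM (fun _s ->
      print_endline "stream socket created");
  (* UDP over IPv4, created the same way with a different type. *)
  with_socket Unix.PF_INET Unix.SOCK_DGRAM (fun _s ->
      print_endline "datagram socket created")
```

Neither socket is connected to anything here; the example only shows that the domain and type arguments are chosen independently.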
The next step is to connect the socket to an address. For that, we use the connect function:
val connect : file_descr -> sockaddr -> unit
(** Connect a socket to an address. *)
Connecting to an Address
Now, we have been introduced to a new type, sockaddr. This type is used to represent a socket address.
type sockaddr =
| ADDR_UNIX of string
| ADDR_INET of inet_addr * int
ADDR_UNIX f is a local socket address, where f is the file name in the file system. ADDR_INET (a,p) is an internet socket address, where a is the IP address and p is the port.
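To make the two constructors concrete, here is a small sketch that builds one address of each kind and prints it. The helper string_of_sockaddr, the path /tmp/example.sock, and port 8080 are all assumptions chosen for illustration.

```ocaml
(* Requires the unix library. *)

(* Hypothetical helper: render a socket address as text. *)
let string_of_sockaddr = function
  | Unix.ADDR_UNIX path -> "unix:" ^ path
  | Unix.ADDR_INET (ip, port) ->
      Printf.sprintf "%s:%d" (Unix.string_of_inet_addr ip) port

(* A local address: the path is an arbitrary example. *)
let local_addr = Unix.ADDR_UNIX "/tmp/example.sock"

(* An internet address built from a literal IP. inet_addr_of_string
   parses the text and does not perform a DNS lookup. *)
let web_addr = Unix.ADDR_INET (Unix.inet_addr_of_string "127.0.0.1", 8080)

let () =
  print_endline (string_of_sockaddr local_addr);
  print_endline (string_of_sockaddr web_addr)
```

Building a sockaddr does not open anything; it is only a value that is later handed to connect.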
The host command performs a DNS lookup on the name passed as an argument and returns the IP address associated with the domain.
$ host ocaml.org
ocaml.org has address 51.159.83.169
ocaml.org mail is handled by 50 fb.mail.gandi.net.
ocaml.org mail is handled by 10 spool.mail.gandi.net.
The lookup returned one IP address associated with the domain ocaml.org. By convention, port 80 is used for HTTP.
An equivalent way to perform the same action in OCaml would be:
# Unix.string_of_inet_addr (Unix.gethostbyname "ocaml.org").h_addr_list.(0);;
- : string = "51.159.83.169"
Therefore, we can create our sockaddr as follows:
let ocaml_org_address =
let ocaml_org_host_entry = Unix.gethostbyname "ocaml.org" in
Unix.ADDR_INET (ocaml_org_host_entry.h_addr_list.(0), 80)
;;
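Unix.gethostbyname raises Not_found when the name cannot be resolved, and indexing h_addr_list.(0) fails if the list of addresses is empty. A slightly more defensive variant can be sketched with Unix.getaddrinfo, which returns an empty list instead of raising. The helper name resolve_tcp4 is hypothetical, and the example resolves localhost so that it does not depend on network access.

```ocaml
(* Requires the unix library. *)

(* Hypothetical helper: resolve [host] and [port] to an IPv4 TCP address.
   Unix.getaddrinfo returns an empty list when nothing matches. *)
let resolve_tcp4 (host : string) (port : int) : Unix.sockaddr option =
  let hints =
    [ Unix.AI_FAMILY Unix.PF_INET; Unix.AI_SOCKTYPE Unix.SOCK_STREAM ]
  in
  match Unix.getaddrinfo host (string_of_int port) hints with
  | [] -> None
  | info :: _ -> Some info.Unix.ai_addr

let () =
  match resolve_tcp4 "localhost" 80 with
  | Some (Unix.ADDR_INET (ip, port)) ->
      Printf.printf "%s:%d\n" (Unix.string_of_inet_addr ip) port
  | Some (Unix.ADDR_UNIX _) | None -> print_endline "could not resolve host"
```

Calling resolve_tcp4 "ocaml.org" 80 instead performs a real DNS lookup and yields a value that can be passed to connect in the same way as ocaml_org_address.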
Now, using everything we have seen and adding a few new functions, which I will explain shortly, we have:
open Unix

(* Read from fdin until the end of the stream, printing each chunk. *)
let print_server_response fdin =
  let buffer_size = 4096 in
  let buffer = Bytes.create buffer_size in
  let rec copy () =
    match read fdin buffer 0 buffer_size with
    | 0 -> ()
    | n ->
        let response = Bytes.sub_string buffer 0 n in
        print_string response;
        copy ()
  in
  copy ()
;;

(* Connect to address, send a greeting, and print whatever comes back. *)
let make_friend address =
  let s = socket Unix.PF_INET Unix.SOCK_STREAM 0 in
  (try
     connect s address;
     let message = "Olá Mundo! \n" in
     send_substring s message 0 (String.length message) [] |> ignore;
     print_server_response s;
     shutdown s Unix.SHUTDOWN_ALL
   with exn ->
     (* Release the handle before propagating the error. *)
     close s;
     raise exn);
  close s
;;

let ocaml_org_address =
  let ocaml_org_host_entry = Unix.gethostbyname "ocaml.org" in
  Unix.ADDR_INET (ocaml_org_host_entry.h_addr_list.(0), 80)
;;
The print_server_response function receives a file descriptor (a value of type file_descr) and prints the content received from the server. To do this, it creates a 4096-byte buffer and repeatedly reads from the descriptor into the buffer, printing the bytes it received each time, until there is nothing more to read.
The make_friend function creates a socket and tries to connect to the address passed as a parameter. If the connection is successful, it sends the message "Olá Mundo! \n" to the server and prints the received response. Finally, it shuts down the connection and closes the socket. If anything fails along the way, the socket is closed and the exception is raised again.
I would like to point out the use of the read function:
val read : file_descr -> bytes -> int -> int -> int
(** [read fd buf pos len] reads [len] bytes from descriptor [fd],
storing them in byte sequence [buf], starting at position [pos] in
[buf]. Return the number of bytes actually read. *)
The read function reads up to len bytes from the file_descr fd and stores them in the buffer buf starting at position pos. It returns the number of bytes actually read, which may be smaller than len. A return value of 0 means the end of the stream has been reached: for a socket, the other side has closed the connection and there is no more data to read. This is exactly the logic we applied in the print_server_response function.
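The same read loop can be observed without any network by using a pair of connected local sockets. The sketch below assumes a Unix-like system (Unix.socketpair with PF_UNIX is not available everywhere) and a message small enough to fit in the socket's buffer. The names read_all and roundtrip are hypothetical, and the 4-byte buffer is deliberately tiny so that several calls to read are needed.

```ocaml
(* Requires the unix library. *)

(* Hypothetical helper: read from [fd] until read returns 0 and
   collect everything into one string. *)
let read_all (fd : Unix.file_descr) : string =
  let chunk = Bytes.create 4 in
  let acc = Buffer.create 64 in
  let rec loop () =
    match Unix.read fd chunk 0 (Bytes.length chunk) with
    | 0 -> Buffer.contents acc
    | n ->
        Buffer.add_subbytes acc chunk 0 n;
        loop ()
  in
  loop ()

(* Send [msg] through one end of a socket pair and read it from the other. *)
let roundtrip (msg : string) : string =
  let a, b = Unix.socketpair Unix.PF_UNIX Unix.SOCK_STREAM 0 in
  ignore (Unix.write_substring a msg 0 (String.length msg));
  (* Shutting down the sending side is what lets read return 0 on b. *)
  Unix.shutdown a Unix.SHUTDOWN_SEND;
  let received = read_all b in
  Unix.close a;
  Unix.close b;
  received

let () = print_endline (roundtrip "hello over a socket")
```

Without the call to shutdown, the loop would block on the final read, because the receiving end would have no way to know that no more data is coming.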
To call the make_friend function, simply pass the server address as a parameter:
make_friend ocaml_org_address
- HTTP/1.1 400 Bad Request
- Content-Type: text/plain; charset=utf-8
- Connection: close
- 400 Bad Request- : unit = ()
Well, at least we made a connection to the server. We found someone out there and sent a message, and they responded. The response means that the server did not understand our message. It starts with "HTTP/1.1", and that is the topic of our next article.
Conclusion
In this article, we explored how to create a socket and connect it to an address. We created a socket for internet communication (IPv4) using the TCP protocol, performed a DNS lookup to obtain the IP address associated with a domain, built a sockaddr from that address and a port, and used it to exchange data with a server.